Composition for cleaving a target DNA comprising a guide RNA specific for the target DNA and Cas protein-encoding nucleic acid or Cas protein, and use thereof

ABSTRACT

The present invention relates to targeted genome editing in eukaryotic cells or organisms. More particularly, the present invention relates to a composition for cleaving a target DNA in eukaryotic cells or organisms comprising a guide RNA specific for the target DNA and Cas protein-encoding nucleic acid or Cas protein, and use thereof.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application is a continuation application of U.S. application Ser. No. 17/004,338 filed Aug. 27, 2020, which is a continuation application of U.S. application Ser. No. 14/685,568 filed Apr. 13, 2015, which is a continuation of PCT/KR2013/009488 filed Oct. 23, 2013, which claims priority to U.S. Provisional Application No. 61/837,481 filed on Jun. 20, 2013, U.S. Provisional Application No. 61/803,599 filed Mar. 20, 2013, and U.S. Provisional Application No. 61/717,324 filed Oct. 23, 2012; the entire contents of each aforementioned application are incorporated herein by reference.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in XML file format and is hereby incorporated by reference in its entirety. Said XML copy, created on Aug. 8, 2023, is named 00074-01005 SL.xml and is 495,414 bytes in size.

TECHNICAL FIELD

The present invention relates to targeted genome editing in eukaryotic cells or organisms. More particularly, the present invention relates to a composition for cleaving a target DNA in eukaryotic cells or organisms comprising a guide RNA specific for the target DNA and Cas protein-encoding nucleic acid or Cas protein, and use thereof.

BACKGROUND ART

CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) are loci containing multiple short direct repeats that are found in the genomes of approximately 40% of sequenced bacteria and 90% of sequenced archaea. CRISPR functions as a prokaryotic immune system, in that it confers resistance to exogenous genetic elements such as plasmids and phages. The CRISPR system provides a form of acquired immunity. Short segments of foreign DNA, called spacers, are incorporated into the genome between CRISPR repeats, and serve as a memory of past exposures. CRISPR spacers are then used to recognize and silence exogenous genetic elements in a manner analogous to RNAi in eukaryotic organisms.

Cas9, an essential protein component in the Type II CRISPR/Cas system, forms an active endonuclease when complexed with two RNAs termed CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA), thereby slicing foreign genetic elements in invading phages or plasmids to protect the host cells. crRNA is transcribed from the CRISPR element in the host genome, which was previously captured from such foreign invaders. Recently, Jinek et al. (1) demonstrated that a single-chain chimeric RNA produced by fusing an essential portion of crRNA and tracrRNA could replace the two RNAs in the Cas9/RNA complex to form a functional endonuclease.

CRISPR/Cas systems offer an advantage over zinc finger and transcription activator-like effector DNA-binding proteins, as the site specificity of nucleotide-binding CRISPR-Cas proteins is governed by an RNA molecule instead of the DNA-binding protein, which can be more challenging to design and synthesize.

However, until now, a genome editing method using the RNA-guided endonuclease (RGEN) based on the CRISPR/Cas system has not been developed.

Meanwhile, restriction fragment length polymorphism (RFLP) analysis is one of the oldest, most convenient, and least expensive methods of genotyping that is still used widely in molecular biology and genetics, but it is often limited by the lack of appropriate sites recognized by restriction endonucleases.

Engineered nuclease-induced mutations are detected by various methods, which include mismatch-sensitive T7 endonuclease I (T7E1) or Surveyor nuclease assays, RFLP, capillary electrophoresis of fluorescent PCR products, dideoxy sequencing, and deep sequencing. The T7E1 and Surveyor assays are widely used but are cumbersome. Furthermore, these enzymes tend to underestimate mutation frequencies because mutant sequences can form homoduplexes with each other, and they cannot distinguish homozygous bi-allelic mutant clones from wild-type cells. RFLP is free of these limitations and therefore is a method of choice. Indeed, RFLP was one of the first methods to detect engineered nuclease-mediated mutations in cells and animals. Unfortunately, however, RFLP is limited by the availability of appropriate restriction sites. It is possible that no restriction sites are available at the target site of interest.

DISCLOSURE OF INVENTION

Technical Problem

Until now, a genome editing and genotyping method using the RNA-guided endonuclease (RGEN) based on the CRISPR/Cas system has not been developed.

Under these circumstances, the present inventors have made many efforts to develop a genome editing method based on the CRISPR/Cas system and finally established a programmable RNA-guided endonuclease that cleaves DNA in a targeted manner in eukaryotic cells and organisms.

In addition, the present inventors have made many efforts to develop a novel method of using RNA-guided endonucleases (RGENs) in RFLP analysis. They have used RGENs to genotype recurrent mutations found in cancer and those induced in cells and organisms by engineered nucleases including RGENs themselves, thereby completing the present invention.

Solution to Problem

It is an object of the present invention to provide a composition for cleaving target DNA in eukaryotic cells or organisms comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is another object of the present invention to provide a composition for inducing targeted mutagenesis in eukaryotic cells or organisms, comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide a kit for cleaving a target DNA in eukaryotic cells or organisms comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide a kit for inducing targeted mutagenesis in eukaryotic cells or organisms comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide a method for preparing a eukaryotic cell or organism comprising Cas protein and a guide RNA, comprising a step of co-transfecting or serial-transfecting the eukaryotic cell or organism with a Cas protein-encoding nucleic acid or Cas protein, and a guide RNA or DNA that encodes the guide RNA.

It is still another object of the present invention to provide a eukaryotic cell or organism comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide a method for cleaving a target DNA in eukaryotic cells or organisms comprising a step of transfecting the eukaryotic cells or organisms comprising a target DNA with a composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide a method for inducing targeted mutagenesis in a eukaryotic cell or organism comprising a step of treating a eukaryotic cell or organism with a composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide an embryo, a genome-modified animal, or genome-modified plant comprising a genome edited by a composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

It is still another object of the present invention to provide a method of preparing a genome-modified animal comprising a step of introducing the composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein into an embryo of an animal; and a step of transferring the embryo into an oviduct of a pseudopregnant foster mother to produce a genome-modified animal.

It is still another object of the present invention to provide a composition for genotyping mutations or variations in an isolated biological sample, comprising a guide RNA specific for the target DNA sequence and Cas protein.

It is still another object of the present invention to provide a method of using an RNA-guided endonuclease (RGEN) to genotype mutations induced by engineered nucleases in cells or naturally-occurring mutations or variations, wherein the RGEN comprises a guide RNA specific for target DNA and Cas protein.

It is still another object of the present invention to provide a kit for genotyping mutations induced by engineered nucleases in cells or naturally-occurring mutations or variations, comprising an RNA-guided endonuclease (RGEN), wherein the RGEN comprises a guide RNA specific for target DNA and Cas protein.

It is still another object of the present invention to provide a composition for genotyping nucleic acid sequences in pathogenic microorganisms in an isolated biological sample, comprising a guide RNA specific for the target DNA sequence and Cas protein.

It is still another object of the present invention to provide a kit for genotyping mutations or variations in an isolated biological sample, comprising the composition, specifically comprising an RNA-guided endonuclease (RGEN), wherein the RGEN comprises a guide RNA specific for target DNA and Cas protein.

It is still another object of the present invention to provide a method of genotyping mutations or variations in an isolated biological sample, using the composition, specifically comprising an RNA-guided endonuclease (RGEN), wherein the RGEN comprises a guide RNA specific for target DNA and Cas protein.

Advantageous Effects of Invention

The present composition for cleaving a target DNA or inducing a targeted mutagenesis in eukaryotic cells or organisms, comprising a guide RNA specific for the target DNA and Cas protein-encoding nucleic acid or Cas protein, the kit comprising the composition, and the method for inducing targeted mutagenesis provide new, convenient genome editing tools. In addition, because custom RGENs can be designed to target any DNA sequence, almost any single nucleotide polymorphism or small insertion/deletion (indel) can be analyzed via RGEN-mediated RFLP. Therefore, the composition and method of the present invention may be used in detecting and cleaving naturally-occurring variations and mutations.

BRIEF DESCRIPTION OF DRAWINGS

FIGS. 1A and 1B show Cas9-catalyzed cleavage of plasmid DNA in vitro. FIG. 1A: Schematic representation of target DNA (SEQ ID NO: 112) and chimeric RNA sequences (SEQ ID NO: 113). Triangles indicate cleavage sites. The PAM sequence recognized by Cas9 is shown in bold. The sequences in the guide RNA (SEQ ID NO: 113) derived from crRNA and tracrRNA are shown in box and underlined, respectively. FIG. 1B: In vitro cleavage of plasmid DNA by Cas9. An intact circular plasmid or ApaLI-digested plasmid was incubated with Cas9 and guide RNA.

FIGS. 2A and 2B show Cas9-induced mutagenesis at an episomal target site. FIG. 2A: Schematic overview of cell-based assays using an RFP-GFP reporter. GFP is not expressed from this reporter because the GFP sequence is fused to the RFP sequence out-of-frame. The RFP-GFP fusion protein is expressed only when the target site between the two sequences is cleaved by a site-specific nuclease. FIG. 2B: Flow cytometry of cells transfected with Cas9. The percentage of cells that express the RFP-GFP fusion protein is indicated.
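
The frame logic of the reporter in FIG. 2A can be sketched in a few lines of Python. This is a simplified illustration only: the function name gfp_in_frame and the initial_offset parameter are hypothetical, the actual frame offset of the reporter is not stated here, and effects such as premature stop codons are ignored. Under these assumptions, GFP is read in frame only when the net indel length cancels the initial frame offset modulo 3.

```python
def gfp_in_frame(initial_offset: int, indel_length: int) -> bool:
    """Return True if GFP is read in frame with RFP after an indel.

    initial_offset: hypothetical number of bases (1 or 2) by which the GFP
        sequence is out of frame relative to RFP before cleavage.
    indel_length: net number of inserted (+) or deleted (-) bases produced
        by repair at the target site.
    """
    return (initial_offset + indel_length) % 3 == 0


# Illustrative run with an assumed offset of 2 bases.
for indel in (-2, -1, 0, 1, 2, 3):
    print(indel, gfp_in_frame(2, indel))
```

In this simplified model, only indels whose net length restores a multiple of three bring GFP into frame, so reporter-positive cells would be expected to reflect a subset of all cleavage and repair events.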

FIGS. 3A and 3B show RGEN-driven mutations at endogenous chromosomal sites. FIG. 3A: CCR5 locus. FIG. 3B: C4BPB locus. (Top) The T7E1 assay was used to detect RGEN-driven mutations. Arrows indicate the expected position of DNA bands cleaved by T7E1. Mutation frequencies (Indels (%)) were calculated by measuring the band intensities. (Bottom) DNA sequences of the wild-type (WT) CCR5 (SEQ ID NO: 114) and C4BPB (SEQ ID NO: 122) and mutant clones. DNA sequences of RGEN-induced mutations at the CCR5 locus: +1 (SEQ ID NO: 115), −13 (SEQ ID NO: 116), −14 (SEQ ID NO: 117), −18 (SEQ ID NO: 118), −19 (SEQ ID NO: 119), −24 (SEQ ID NO: 120), and −30 (SEQ ID NO: 121). DNA sequences of RGEN-induced mutations at the C4BPB locus: +1 (SEQ ID NO: 122), +2 (SEQ ID NO: 123), −30 (SEQ ID NO: 125), and −180 (SEQ ID NO: 126). The region of the target sequence complementary to the guide RNA is shown in box. The PAM sequence is shown in bold. Triangles indicate the cleavage site. Bases corresponding to microhomologies are underlined. The column on the right indicates the number of inserted or deleted bases.
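
The description of FIGS. 3A and 3B states only that mutation frequencies were calculated from band intensities; the exact formula is not given. One commonly used estimate for T7E1 assays, shown here purely as an assumption, is indel % = 100 × (1 − sqrt(1 − f)), where f is the fraction of total signal found in the cleaved bands. The function name and the example intensities below are hypothetical.

```python
import math
from typing import Sequence


def indel_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Estimate indel frequency (%) from gel band intensities.

    uncut: intensity of the uncleaved PCR product band.
    cut_bands: intensities of the bands produced by T7E1 cleavage.
    Uses the assumed estimate 100 * (1 - sqrt(1 - fraction_cleaved)).
    """
    cleaved = sum(cut_bands)
    total = uncut + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = cleaved / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical intensities: 75 units uncut, 15 and 10 units in cleaved bands.
print(round(indel_percent(75.0, [15.0, 10.0]), 1))
```

The square root reflects the assumption that cleavable heteroduplexes form by random re-annealing of mutant and wild-type strands; other quantification schemes are possible.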

FIGS. 4A, 4B, and 4C show that RGEN-driven off-target mutations are undetectable. FIG. 4A: On-target and potential off-target sequences. The human genome was searched in silico for potential off-target sites. Four sites were identified, ADCY5 (SEQ ID NO: 128), KCNJ6 (SEQ ID NO: 129), CNTNAP2 (SEQ ID NO: 130), and Chr. 5 N/A (SEQ ID NO: 131), each of which carries 3-base mismatches with the CCR5 on-target (SEQ ID NO: 127). Mismatched bases are underlined. FIG. 4B: The T7E1 assay was used to investigate whether these sites were mutated in cells transfected with the Cas9/RNA complex. No mutations were detected at these sites. N/A (not applicable), an intergenic site. FIG. 4C: Cas9 did not induce off-target-associated chromosomal deletions. The CCR5-specific RGEN and ZFN were expressed in human cells. PCR was used to detect the induction of the 15-kb chromosomal deletions in these cells.
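
An in silico search of the kind described for FIG. 4A might be sketched as follows. The sketch assumes that a candidate site consists of a protospacer immediately followed by a 5′-NGG-3′ PAM, and it scans only the forward strand of a toy sequence. The function names, the toy sequences, and the default mismatch threshold are illustrative assumptions, not the search procedure actually used.

```python
from typing import List, Tuple


def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(
    genome: str, protospacer: str, max_mismatches: int = 3
) -> List[Tuple[int, str, str, int]]:
    """Scan the forward strand for protospacer-like sites followed by NGG.

    Returns (start index, site, PAM, mismatch count) for every window whose
    mismatch count does not exceed max_mismatches.
    """
    n = len(protospacer)
    hits = []
    for i in range(len(genome) - n - 3 + 1):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":  # require an NGG PAM (assumed PAM rule)
            continue
        mismatches = count_mismatches(site, protospacer)
        if mismatches <= max_mismatches:
            hits.append((i, site, pam, mismatches))
    return hits


# Toy example: one perfect site and one site with three mismatches.
guide = "ACGTACGTACGTACGTACGT"
toy = "TT" + guide + "TGG" + "AAAA" + "TCGTAGGTACCTACGTACGT" + "AGG" + "CC"
for hit in find_candidate_sites(toy, guide):
    print(hit)
```

A genome-scale search would also need to scan the reverse-complement strand and would normally rely on an indexed aligner rather than a linear scan.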

FIGS. 5A, 5B, 5C, and 5D show RGEN-induced Foxn1 gene targeting in mice. FIG. 5A: A schematic diagram depicting target DNA (SEQ ID NO: 132) and an sgRNA specific to exon 2 of the mouse Foxn1 gene (SEQ ID NO: 133). PAM in exon 2 is shown in a box and the sequence in the sgRNA that is complementary to exon 2 is underlined. Triangles indicate cleavage sites. FIG. 5B: Representative T7E1 assays demonstrating gene-targeting efficiencies of Cas9 mRNA plus Foxn1-specific sgRNA that were delivered via intra-cytoplasmic injection into one-cell stage mouse embryos. Numbers indicate independent founder mice generated from the highest dose. Arrows indicate bands cleaved by T7E1. FIG. 5C: DNA sequences of wild-type (WT) Foxn1 (SEQ ID NO: 134) and mutant alleles (SEQ ID NOs. 135-141) observed in three Foxn1 mutant founders identified in FIG. 5B. DNA sequences of mutant alleles in founder #108: −44 (SEQ ID NO: 135), −23 (SEQ ID NO: 136), −17 (SEQ ID NO: 137), and +1 (SEQ ID NO: 138). DNA sequences of mutant alleles in founder #111: +1 (SEQ ID NO: 138) and −11 (SEQ ID NO: 139). DNA sequences of mutant alleles in founder #114: −6 (SEQ ID NO: 140), −17 (SEQ ID NO: 137), and −8 (SEQ ID NO: 141). The number of occurrences is shown in parentheses.

FIG. 5D: PCR genotyping of F1 progenies derived from crossing Foxn1 founder #108 and wild-type FVB/NTac. Note the segregation of the mutant alleles found in Foxn1 founder #108 in the progenies.

FIGS. 6A, 6B, and 6C show Foxn1 gene targeting in mouse embryos by intra-cytoplasmic injection of Cas9 mRNA and Foxn1-sgRNA. FIG. 6A: A representative result of a T7E1 assay monitoring the mutation rate after injecting the highest dose. Arrows indicate bands cleaved by T7E1. FIG. 6B: A summary of T7E1 assay results. Mutant fractions among in vitro cultivated embryos obtained after intra-cytoplasmic injection of the indicated RGEN doses are indicated. FIG. 6C: DNA sequences of wild-type (WT) Foxn1 (SEQ ID NO: 143) and Foxn1 mutant alleles (SEQ ID NOs. 144-152) identified from a subset of T7E1-positive mutant embryos. The DNA sequences of the mutant alleles are: Δ11 (SEQ ID NO: 144), Δ11+Δ17 (SEQ ID NO: 145), Δ57 (SEQ ID NO: 146), Δ17 (SEQ ID NO: 147), +1 (SEQ ID NO: 148), Δ12 (SEQ ID NO: 149), Δ72 (SEQ ID NO: 150), Δ25 (SEQ ID NO: 151), and Δ24 (SEQ ID NO: 152). The target sequence of the wild-type allele is denoted in box.

FIGS. 7A, 7B, and 7C show Foxn1 gene targeting in mouse embryos using the recombinant Cas9 protein: Foxn1-sgRNA complex. FIG. 7A and FIG. 7B are representative T7E1 assay results and their summaries. Embryos were cultivated in vitro after they underwent pronuclear (FIG. 7A) or intra-cytoplasmic injection (FIG. 7B). Underlined numbers indicate T7E1-positive mutant founder mice. FIG. 7C: DNA sequences of wild-type (WT) Foxn1 (SEQ ID NO: 153) and Foxn1 mutant alleles (SEQ ID NOs. 154-166) identified from the in vitro cultivated embryos that were obtained by the pronucleus injection of recombinant Cas9 protein: Foxn1-sgRNA complex at the highest dose. The target sequence of the wild-type allele is denoted in box. The DNA sequences of the mutant alleles are: Δ18 (SEQ ID NO: 154), Δ20 (SEQ ID NO: 155), Δ19 (SEQ ID NO: 156), Δ17 (SEQ ID NO: 157), Δ11 (SEQ ID NO: 158), Δ3+1 (SEQ ID NO: 159), Δ2 (SEQ ID NO: 160), +1, Embryo 1 (SEQ ID NO: 161), +1, Embryo 10 (SEQ ID NO: 162), Δ6 (SEQ ID NO: 163), Δ5 (SEQ ID NO: 164), Δ28 (SEQ ID NO: 165), and Δ126 (SEQ ID NO: 166).

FIGS. 8A, 8B, and 8C show germ-line transmission of the mutant alleles found in Foxn1 mutant founder #12. FIG. 8A: wild-type fPCR analysis. FIG. 8B: Foxn1 mutant founder #12 fPCR analysis. FIG. 8C: PCR genotyping of wild-type FVB/NTac, the founder mouse, and their F1 progenies.

FIGS. 9A and 9B show genotypes of embryos generated by crossing Prkdc mutant founders. Prkdc mutant founders ♂25 and ♀15 were crossed and E13.5 embryos were isolated. FIG. 9A: fPCR analysis of wild-type, founder ♂25, and founder ♀15. Note that, due to the technical limitations of fPCR analysis, these results showed small differences from the precise sequences of the mutant alleles; e.g., from the sequence analysis, Δ269/Δ61/WT and Δ5+1/+7/+12/WT were identified in founders ♂25 and ♀15, respectively. FIG. 9B: Genotypes of the generated embryos.

FIGS. 10A, 10B, 10C, 10D, and 10E show Cas9 protein/sgRNA complex-induced targeted mutation at the CCR5 gene (FIGS. 10A-10C) and the ABCC11 gene (FIGS. 10D-10E). FIG. 10A: Results of a T7E1 assay monitoring the mutation rate at the CCR5 locus after introducing Cas9 protein and sgRNA or Cas9 protein and crRNA+tracrRNA into K562 cells. FIG. 10B: Results of a T7E1 assay using ⅕ scaled-down doses of Cas9 protein and sgRNA. FIG. 10C: Wild-type (WT) CCR5 sequence (SEQ ID NO: 114) and Cas protein-induced mutant sequences (SEQ ID NOs. 167-171 and 115) identified in the CCR5 locus. The DNA sequences of the mutant sequences are: −4 (SEQ ID NO: 167), −4 (SEQ ID NO: 168), −7 (SEQ ID NO: 169), −1 (SEQ ID NO: 170), +1 (SEQ ID NO: 115), and −17, +1 (SEQ ID NO: 171). FIG. 10D: Results of a T7E1 assay monitoring the mutation rate at the ABCC11 locus after introducing Cas9 protein and sgRNA into K562 cells. FIG. 10E: Wild-type (WT) ABCC11 sequence (SEQ ID NO: 172) and Cas9 protein-induced mutant sequences (SEQ ID NOs. 173-176) identified in the ABCC11 locus. The DNA sequences of the mutant sequences are: −6 (SEQ ID NO: 173), −3 (SEQ ID NO: 174), −29 (SEQ ID NO: 175), −20 (SEQ ID NO: 176), and −256 (TTCTC).

FIG. 11 shows recombinant Cas9 protein-induced mutations in Arabidopsis protoplasts.

FIG. 12 shows the wild-type BRI1 sequence (SEQ ID NO: 177) and recombinant Cas9 protein-induced mutant sequences (SEQ ID NOs. 178-181) in the Arabidopsis BRI1 gene. The DNA sequences of the mutant sequences are: −7 (SEQ ID NO: 178), −224 (SEQ ID NO: 179), −223 (SEQ ID NO: 180), and −223, +62 (SEQ ID NO: 181).

FIG. 13 shows a T7E1 assay showing endogenous CCR5 gene disruption in 293 cells by treatment of Cas9-mal-9R4L and sgRNA/C9R4LC complex.

FIGS. 14A and 14B show mutation frequencies at on-target and off-target sites of RGENs reported in Fu et al. (2013). T7E1 assays analyzing genomic DNA from K562 cells (R) transfected serially with 20 μg of Cas9-encoding plasmid and with 60 μg and 120 μg of in vitro transcribed GX19 crRNA and tracrRNA, respectively (1×10⁶ cells), or (D) co-transfected with 1 μg of Cas9-encoding plasmid and 1 μg of GX19 sgRNA expression plasmid (2×10⁵ cells). FIG. 14A: VEGFA site 1 on-target sequence (SEQ ID NO: 182) and off-target sequences OT1-3 (SEQ ID NO: 183) and OT1-11 (SEQ ID NO: 184). VEGFA site 2 on-target sequence (SEQ ID NO: 185) and off-target sequences OT2-1 (SEQ ID NO: 186), OT2-9 (SEQ ID NO: 187), and OT2-24 (SEQ ID NO: 188). FIG. 14B: VEGFA site 3 on-target sequence (SEQ ID NO: 189) and off-target sequence OT3-18 (SEQ ID NO: 190), and EMX1 on-target sequence (SEQ ID NO: 191) and off-target sequence OT4-1 (SEQ ID NO: 192).

FIGS. 15A and 15B show a comparison of guide RNA structure. Mutation frequencies of the RGENs reported in Fu et al. (2013) were measured at on-target and off-target sites using the T7E1 assay. K562 cells were co-transfected with the Cas9-encoding plasmid and the plasmid encoding GX19 sgRNA or GGX20 sgRNA. Off-target sites (OT1-3 etc.) are labeled as in Fu et al. (2013). FIG. 15A: VEGFA site 1 on-target sequence (SEQ ID NO: 182) and off-target sequences OT1-3 (SEQ ID NO: 183) and OT1-11 (SEQ ID NO: 184). VEGFA site 2 on-target sequence (SEQ ID NO: 185) and off-target sequences OT2-1 (SEQ ID NO: 186), OT2-9 (SEQ ID NO: 187), and OT2-24 (SEQ ID NO: 188). FIG. 15B: VEGFA site 3 on-target sequence (SEQ ID NO: 189) and off-target sequence OT3-18 (SEQ ID NO: 190), and EMX1 on-target sequence (SEQ ID NO: 191) and off-target sequence OT4-1 (SEQ ID NO: 192).

FIGS. 16A, 16B, 16C, and 16D show in vitro DNA cleavage by Cas9 nickases. FIG. 16A: Schematic overview of the Cas9 nuclease and the paired Cas9 nickase. The PAM sequences and cleavage sites are shown in box. FIG. 16B: Target sites in the human AAVS1 locus. The position of each target site is shown in triangle. FIG. 16C: Schematic overview of DNA cleavage reactions. FAM dyes (shown in box) were linked to both 5′ ends of the DNA substrate. FIG. 16D: DSBs and SSBs analyzed using fluorescent capillary electrophoresis. Fluorescently-labeled DNA substrates were incubated with Cas9 nucleases or nickases before electrophoresis.

FIGS. 17A and 17B show a comparison of Cas9 nuclease and nickase behavior. FIG. 17A: On-target mutation frequencies associated with Cas9 nucleases (WT), nickases (D10A), and paired nickases at the following target sequences of the AAVS1 locus: S1 (SEQ ID NO: 193), S2 (SEQ ID NO: 194), S3 (SEQ ID NO: 195), S4 (SEQ ID NO: 196), S5 (SEQ ID NO: 197), S6 (SEQ ID NO: 198), AS1 (SEQ ID NO: 199), AS2 (SEQ ID NO: 200), and AS3 (SEQ ID NO: 201). Paired nickases that would produce 5′ overhangs or 3′ overhangs are indicated. FIG. 17B: Analysis of off-target effects of Cas9 nucleases and paired nickases. A total of seven potential off-target sites (SEQ ID NOs. 202-208) for three sgRNAs were analyzed. The mutation frequency for the S2 on-target sequence (SEQ ID NO: 194) was compared to the off-target sequences, S2 Off-1 (SEQ ID NO: 202) and S2 Off-2 (SEQ ID NO: 203). The mutation frequency for the S3 on-target sequence (SEQ ID NO: 195) was compared to the off-target sequences, S3 Off-1 (SEQ ID NO: 204) and S3 Off-2 (SEQ ID NO: 205). The mutation frequency for the AS2 on-target sequence (SEQ ID NO: 198) was compared to the off-target sequences, AS2 Off-1 (SEQ ID NO: 206), AS2 Off-6 (SEQ ID NO: 207), and AS2 Off-9 (SEQ ID NO: 208).

FIGS. 18A, 18B, 18C, and 18D show paired Cas9 nickases tested at other endogenous human loci. The sgRNA target sites at the human CCR5 locus (FIG. 18A; SEQ ID NO: 209) and the BRCA2 locus (FIG. 18C; SEQ ID NO: 210). PAM sequences are indicated in a box. Genome editing activities at CCR5 (FIG. 18B) and BRCA2 (FIG. 18D) target sites were detected by the T7E1 assay. The repair of two nicks that would produce 5′ overhangs led to the formation of indels much more frequently than did those producing 3′ overhangs.

FIGS. 19A and 19B show that paired Cas9 nickases mediate homologous recombination. FIG. 19A: Strategy to detect homologous recombination. Donor DNA included an XbaI restriction enzyme site between two homology arms, whereas the endogenous target site lacked this site. A PCR assay was used to detect sequences that had undergone homologous recombination. To prevent amplification of contaminating donor DNA, primers specific to genomic DNA were used. FIG. 19B: Efficiency of homologous recombination. Only amplicons of a region in which homologous recombination had occurred could be digested with XbaI; the intensities of the cleavage bands were used to measure the efficiency of this method.

FIGS. 20A, 20B, 20C, and 20D show DNA splicing induced by paired Cas9 nickases. FIG. 20A: The target sites of paired nickases in the human AAVS1 locus. The distances between the AS2 site and each of the other sites are shown. Arrows indicate PCR primers. FIG. 20B: Genomic deletions detected using PCR. Asterisks indicate deletion-specific PCR products. FIG. 20C: DNA sequences of wild-type (WT) (SEQ ID NO: 211 and 332) and the following deletion-specific PCR products (SEQ ID NOs. 212-218) obtained using AS2 sgRNAs or deletion-specific PCR products (SEQ ID NOs. 219-224) using L1 sgRNAs. Target site PAM sequences are shown in box and sgRNA-matching sequences are shown in capital letters. Intact sgRNA-matching sequences are underlined. FIG. 20D: A schematic model of paired Cas9 nickase-mediated chromosomal deletions. Newly-synthesized DNA strands are shown in box.

FIGS. 21A, 21B, and 21C show that paired Cas9 nickases do not induce translocations. FIG. 21A: Schematic overview of chromosomal translocations between the on-target and off-target sites. FIG. 21B: PCR amplification to detect chromosomal translocations. FIG. 21C: Translocations induced by Cas9 nucleases but not by the nickase pair.

FIGS. 22A and 22B show a conceptual diagram of the T7E1 and RFLP assays. FIG. 22A: Comparison of assay cleavage reactions in four possible scenarios after engineered nuclease treatment in a diploid cell: (A) wild-type, (B) a monoallelic mutation, (C) different biallelic mutations (hetero), and (D) identical biallelic mutations (homo). Black lines represent PCR products derived from each allele; dashed and dotted boxes indicate insertion/deletion mutations generated by NHEJ. FIG. 22B: Expected results of T7E1 and RGEN digestion resolved by electrophoresis.
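
The four scenarios of FIG. 22A can be summarized with a small model, given here as a sketch under stated assumptions rather than as the assay itself. It assumes that T7E1 cleaves only heteroduplexes, which require at least two distinct sequences among the PCR products, and that a wild-type-specific RGEN cleaves only products matching the wild-type sequence. The function names and the placeholder allele labels are hypothetical.

```python
from typing import Sequence


def t7e1_cleaves(alleles: Sequence[str]) -> bool:
    """T7E1 cuts mismatched heteroduplexes, which can form only when the
    PCR products contain at least two distinct sequences."""
    return len(set(alleles)) > 1


def rgen_rflp(alleles: Sequence[str], wild_type: str) -> str:
    """Classify digestion by an RGEN assumed to cut only wild-type products."""
    n_cut = sum(allele == wild_type for allele in alleles)
    if n_cut == len(alleles):
        return "fully cleaved"
    if n_cut == 0:
        return "uncleaved"
    return "partially cleaved"


WT, MUT1, MUT2 = "wt", "mut1", "mut2"  # placeholder allele labels
scenarios = {
    "(A) wild-type": (WT, WT),
    "(B) monoallelic": (WT, MUT1),
    "(C) biallelic (hetero)": (MUT1, MUT2),
    "(D) biallelic (homo)": (MUT1, MUT1),
}
for name, alleles in scenarios.items():
    print(name, "| T7E1 cleaves:", t7e1_cleaves(alleles),
          "| RGEN-RFLP:", rgen_rflp(alleles, WT))
```

In this model, scenarios (A) and (D) both leave T7E1 without a substrate, whereas the RGEN digestion pattern separates them, which is the distinction the conceptual diagram is meant to convey.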

FIG. 23 shows an in vitro cleavage assay of a linearized plasmid containing the C4BPB target site bearing indels. DNA sequences of individual plasmid substrates (upper panel): WT (SEQ ID NO: 104), I1 (SEQ ID NO: 225), I2 (SEQ ID NO: 226), I3 (SEQ ID NO: 227), D1 (SEQ ID NO: 228), D2 (SEQ ID NO: 229), and D3 (SEQ ID NO: 230). The PAM sequence is underlined. Inserted bases are shown in box. Arrows (bottom panel) indicate expected positions of DNA bands cleaved by the wild-type-specific RGEN after electrophoresis.

FIGS. 24A and 24B show genotyping of mutations induced by engineered nucleases in cells via RGEN-mediated RFLP. FIG. 24A: Genotype of C4BPB wild type (SEQ ID NO: 231) and the following mutant K562 cell clones: +3 (SEQ ID NO: 232), −12 (SEQ ID NO: 233), −9 (SEQ ID NO: 234), −8 (SEQ ID NO: 235), −36 (SEQ ID NO: 236), +1 (SEQ ID NO: 237), +1 (SEQ ID NO: 238), +67 (SEQ ID NO: 239), −7, +1 (SEQ ID NO: 240), and −94 (SEQ ID NO: 241). FIG. 24B: Comparison of the mismatch-sensitive T7E1 assay with RGEN-mediated RFLP analysis. Black arrows indicate the cleavage product by treatment of T7E1 enzyme or RGENs.

FIGS. 25A, 25B, and 25C show genotyping of RGEN-induced mutations via the RGEN-RFLP technique. FIG. 25A: Analysis of C4BPB-disrupted clones using RGEN-RFLP and T7E1 assays. Arrows indicate expected positions of DNA bands cleaved by RGEN or T7E1. FIG. 25B: Quantitative comparison of RGEN-RFLP analysis with T7E1 assays. Genomic DNA samples from wild-type and C4BPB-disrupted K562 cells were mixed in various ratios and subjected to PCR amplification. FIG. 25C: Genotyping of RGEN-induced mutations in the HLA-B gene in HeLa cells with RFLP and T7E1 analyses.

FIGS. 26A and 26B show genotyping of mutations induced by engineered nucleases in organisms via RGEN-mediated RFLP. FIG. 26A: Genotype of Pibf1 wild-type (WT) (SEQ ID NO: 242) and the following mutant founder mice: #1 (SEQ ID NO: 243 and SEQ ID NO: 244), #3 (SEQ ID NO: 245 and SEQ ID NO: 246), #4 (SEQ ID NO: 247 and SEQ ID NO: 242), #5 (SEQ ID NO: 246 and SEQ ID NO: 242), #6 (SEQ ID NO: 248 and SEQ ID NO: 249), #8 (SEQ ID NO: 250 and SEQ ID NO: 251), and #11 (SEQ ID NO: 252 and SEQ ID NO: 250). FIG. 26B: Comparison of the mismatch-sensitive T7E1 assay with RGEN-mediated RFLP analysis. Black arrows indicate the cleavage product by treatment of T7E1 enzyme or RGENs.

FIG. 27 shows RGEN-mediated genotyping of ZFN-induced mutations at a wild-type CCR5 sequence (SEQ ID NO: 253). The ZFN target site is shown in box. Black arrows indicate DNA bands cleaved by T7E1.

FIG. 28 shows polymorphic sites in a region of the human HLA-B gene (SEQ ID NO: 254). The sequence, which surrounds the RGEN target site, is that of a PCR amplicon from HeLa cells. Polymorphic positions are shown in box. The RGEN target site and the PAM sequence are shown in dashed and bolded box, respectively. Primer sequences are underlined.

FIGS. 29A and 29B show genotyping of oncogenic mutations via RGEN-RFLP analysis. FIG. 29A: A recurrent mutation (c.133-135 deletion of TCT; SEQ ID NO: 256) in the human CTNNB1 gene in HCT116 cells was detected by RGENs. The wild-type CTNNB1 sequence is represented by SEQ ID NO: 255. HeLa cells were used as a negative control. FIG. 29B: Genotyping of the KRAS substitution mutation (c.34G>A) in the A549 cancer cell line with RGENs that contain mismatched guide RNA that are WT-specific (SEQ ID NO: 257) or mutant-specific (SEQ ID NO: 258). Mismatched nucleotides are shown in box. HeLa cells were used as a negative control. Arrows indicate DNA bands cleaved by RGENs. DNA sequences confirmed by Sanger sequencing are shown: wild-type (SEQ ID NO: 259) and c.34G>A (SEQ ID NO: 260).

FIGS. 30A, 30B, 30C, and 30D show genotyping of the CCR5 delta32 allele in HEK293T cells via RGEN-RFLP analysis. FIG. 30A: RGEN-RFLP assays of cell lines. DNA sequences of the wild-type CCR5 locus (SEQ ID NO: 262) and delta32 mutation (SEQ ID NO: 261) are shown. K562, SKBR3, and HeLa cells were used as wild-type controls. Arrows indicate DNA bands cleaved by RGENs. FIG. 30B: DNA sequence of wild-type (SEQ ID NO: 263) and delta32 CCR5 alleles (SEQ ID NO: 264). Both on-target and off-target sites of RGENs used in RFLP analysis are underlined. A single-nucleotide mismatch between the two sites is shown in box. The PAM sequence is underlined. FIG. 30C: In vitro cleavage of plasmids harboring WT or del32 CCR5 alleles using the wild-type-specific RGEN. FIG. 30D: Confirming the presence of an off-target site of the CCR5-delta32-specific RGEN at the CCR5 locus. In vitro cleavage assays of plasmids harboring either on-target (SEQ ID NO: 265) or off-target sequences (SEQ ID NO: 266) using various amounts of the del32-specific RGEN.

FIGS. 31A and 31B show genotyping of a KRAS point mutation (c.34G>A). FIG. 31A: RGEN-RFLP analysis of the KRAS mutation (c.34G>A) in cancer cell lines. PCR products from HeLa cells (used as a wild-type control) or A549 cells, which are homozygous for the point mutation, were digested with RGENs with perfectly matched crRNA specific to the wild-type sequence (SEQ ID NO: 259) or the mutant sequence (SEQ ID NO: 260). KRAS genotypes in these cells were confirmed by Sanger sequencing. FIG. 31B: Plasmids harboring either the wild-type (SEQ ID NO: 259) or mutant KRAS sequences (SEQ ID NO: 260) were digested using RGENs with perfectly matched crRNAs or attenuated, one-base mismatched crRNAs: m7 (SEQ ID NO: 267), m6 (SEQ ID NO: 257), m5 (SEQ ID NO: 268), m4 (SEQ ID NO: 269), m8 (SEQ ID NO: 260), m7,8 (SEQ ID NO: 270), m6,8 (SEQ ID NO: 258), m5,8 (SEQ ID NO: 271), and m4,8 (SEQ ID NO: 272). Attenuated crRNAs that were chosen for genotyping are labeled in box above the gels.

FIGS. 32A and 32B show genotyping of a PIK3CA point mutation (c.3140A>G). FIG. 32A: RGEN-RFLP analysis of the PIK3CA mutation (c.3140A>G) in cancer cell lines. PCR products from HeLa cells (used as a wild-type control) or HCT116 cells that are heterozygous for the point mutation were digested with RGENs with perfectly matched crRNA specific to the wild-type sequence (SEQ ID NO: 273) or the mutant sequence (SEQ ID NO: 274). PIK3CA genotypes in these cells were confirmed by Sanger sequencing. FIG. 32B: Plasmids harboring either the wild-type PIK3CA sequence (SEQ ID NO: 273) or mutant PIK3CA sequence (SEQ ID NO: 274) were digested using RGENs with perfectly matched crRNAs or attenuated, one-base mismatched crRNAs: m5 (SEQ ID NO: 275), m6 (SEQ ID NO: 276), m7 (SEQ ID NO: 277), m10 (SEQ ID NO: 278), m13 (SEQ ID NO: 279), m16 (SEQ ID NO: 280), m19 (SEQ ID NO: 281), m4 (SEQ ID NO: 274), m4,5 (SEQ ID NO: 282), m4,6 (SEQ ID NO: 283), m4,7 (SEQ ID NO: 284), m4,10 (SEQ ID NO: 285), m4,13 (SEQ ID NO: 286), m4,16 (SEQ ID NO: 287), and m4,19 (SEQ ID NO: 288). Attenuated crRNAs that were chosen for genotyping are labeled in box above the gels.

FIGS. 33A, 33B, 33C, and 33D show genotyping of recurrent point mutations in cancer cell lines. FIG. 33A: RGEN-RFLP assays to distinguish between a wild-type IDH gene sequence (SEQ ID NO: 289) and a recurrent oncogenic point mutation sequence in the IDH gene (c.394C>T; SEQ ID NO: 290). RGENs with attenuated, one-base mismatched crRNAs, SEQ ID NO: 291 (WT-Specific RNA) and SEQ ID NO: 292 (Mutant-Specific RNA), distinguished the wild type and mutant IDH sequences. FIG. 33B: RGEN-RFLP assays to distinguish between a wild-type PIK3CA gene sequence (SEQ ID NO: 271) and a recurrent oncogenic point mutation sequence in the PIK3CA gene (c.3140A>G; SEQ ID NO: 273). RGENs with attenuated, one-base mismatched crRNAs, SEQ ID NO: 275 (WT-Specific RNA) and SEQ ID NO: 284 (Mutant-Specific RNA), distinguished the wild type and mutant PIK3CA sequences. FIG. 33C: RGEN-RFLP assays to distinguish between a wild-type NRAS gene sequence (SEQ ID NO: 293) and a recurrent oncogenic point mutation sequence in the NRAS gene (c.181C>A; SEQ ID NO: 294). RGENs with perfectly matched crRNAs, SEQ ID NO: 293 (WT-Specific RNA) and SEQ ID NO: 294 (Mutant-Specific RNA), distinguished the wild type and mutant NRAS sequences. FIG. 33D: RGEN-RFLP assays to distinguish between a wild-type BRAF gene sequence (SEQ ID NO: 295) and a recurrent oncogenic point mutation sequence in the BRAF gene (c.1799T>A; SEQ ID NO: 296). RGENs with perfectly matched crRNAs, SEQ ID NO: 295 (WT-Specific RNA) and SEQ ID NO: 296 (Mutant-Specific RNA), distinguished the wild type and mutant BRAF sequences. Genotypes of each cell line confirmed by Sanger sequencing are shown. Mismatched nucleotides are shown in box. Black arrows indicate DNA bands cleaved by RGENs.

BEST MODE FOR CARRYING OUT THE INVENTION

In accordance with one aspect of the invention, the present invention provides a composition for cleaving target DNA in eukaryotic cells or organisms comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein. In addition, the present invention provides a use of the composition for cleaving target DNA in eukaryotic cells or organisms comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

In the present invention, the composition is also referred to as an RNA-guided endonuclease (RGEN) composition.

ZFNs and TALENs enable targeted mutagenesis in mammalian cells, model organisms, plants, and livestock, but the mutation frequencies obtained with individual nucleases vary widely. Furthermore, some ZFNs and TALENs fail to show any genome editing activities. DNA methylation may limit the binding of these engineered nucleases to target sites. In addition, it is technically challenging and time-consuming to make customized nucleases.

The present inventors have developed a new RNA-guided endonuclease composition based on Cas protein to overcome the disadvantages of ZFNs and TALENs.

Prior to the present invention, an endonuclease activity of Cas proteins had been known. However, it had not been known whether the endonuclease activity of Cas protein would function in a eukaryotic cell because of the complexity of the eukaryotic genome. Further, until now, a composition comprising Cas protein or Cas protein-encoding nucleic acid and a guide RNA specific for the target DNA to cleave a target DNA in eukaryotic cells or organisms has not been developed.

Compared to ZFNs and TALENs, the present RGEN composition based on Cas protein can be more readily customized because only the synthetic guide RNA component is replaced to make a new genome-editing nuclease. No sub-cloning steps are involved to make customized RNA-guided endonucleases. Furthermore, the relatively small size of the Cas gene (for example, 4.2 kbp for Cas9) as compared to a pair of TALEN genes (~6 kbp) provides an advantage for this RNA-guided endonuclease composition in some applications such as virus-mediated gene delivery. Further, this RNA-guided endonuclease does not have off-target effects and thus does not induce unwanted mutations, deletions, inversions, and duplications. These features make the present RNA-guided endonuclease composition a scalable, versatile, and convenient tool for genome engineering in eukaryotic cells and organisms. In addition, because RGENs can be designed to target any DNA sequence, almost any single nucleotide polymorphism or small insertion/deletion (indel) can be analyzed via RGEN-mediated RFLP. The specificity of RGENs is determined by the RNA component that hybridizes with a target DNA sequence of up to 20 base pairs (bp) in length and by the Cas9 protein that recognizes the protospacer-adjacent motif (PAM). RGENs are readily reprogrammed by replacing the RNA component. Therefore, RGENs provide a platform to use simple and robust RFLP analysis for various sequence variations.

The target DNA may be an endogenous DNA or an artificial DNA, preferably an endogenous DNA.

As used herein, the term “Cas protein” refers to an essential protein component in the CRISPR/Cas system that forms an active endonuclease or nickase when complexed with two RNAs termed CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA).

Information on the genes and proteins of Cas is available from GenBank of the National Center for Biotechnology Information (NCBI), without limitation.

The CRISPR-associated (cas) genes encoding Cas proteins are often associated with CRISPR repeat-spacer arrays. More than forty different Cas protein families have been described. Of these protein families, Cas1 appears to be ubiquitous among different CRISPR/Cas systems. There are three types of CRISPR-Cas system. Among them, Type II CRISPR/Cas system involving Cas9 protein and crRNA and tracrRNA is representative and is well known. Particular combinations of cas genes and repeat structures have been used to define 8 CRISPR subtypes (E. coli, Ypest, Nmeni, Dvulg, Tneap, Hmari, Apern, and Mtube).

The Cas protein may be linked to a protein transduction domain. The protein transduction domain may be poly-arginine or a TAT protein derived from HIV, but it is not limited thereto.

The present composition may comprise Cas component in the form of a protein or in the form of a nucleic acid encoding Cas protein.

In the present invention, Cas protein may be any Cas protein provided that it has an endonuclease or nickase activity when complexed with a guide RNA.

Preferably, Cas protein is Cas9 protein or variants thereof.

The variant of the Cas9 protein may be a mutant form of Cas9 in which the catalytic aspartate residue is changed to any other amino acid. Preferably, the other amino acid may be an alanine, but it is not limited thereto.

Further, Cas protein may be the one isolated from an organism such as Streptococcus sp., preferably Streptococcus pyogenes, or a recombinant protein, but it is not limited thereto.

The Cas protein derived from Streptococcus pyogenes may recognize NGG trinucleotide. The Cas protein may comprise an amino acid sequence of SEQ ID NO: 109, but it is not limited thereto.

The term “recombinant” when used with reference, e.g., to a cell, nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, a recombinant Cas protein may be generated by reconstituting Cas protein-encoding sequence using the human codon table.

As for the present invention, Cas protein-encoding nucleic acid may be in the form of a vector, such as a plasmid comprising Cas-encoding sequence under a promoter such as CMV or CAG. When Cas protein is Cas9, Cas9 encoding sequence may be derived from Streptococcus sp., and preferably derived from Streptococcus pyogenes. For example, Cas9 encoding nucleic acid may comprise the nucleotide sequence of SEQ ID NO: 1. Moreover, Cas9 encoding nucleic acid may comprise the nucleotide sequence having homology of at least 50% to the sequence of SEQ ID NO: 1, preferably at least 60, 70, 80, 90, 95, 97, 98, or 99% to the SEQ ID NO: 1, but it is not limited thereto. Cas9 encoding nucleic acid may comprise the nucleotide sequence of SEQ ID NOs. 108, 110, 106, or 107.

As used herein, the term “guide RNA” refers to an RNA which is specific for the target DNA and can form a complex with Cas protein and bring Cas protein to the target DNA.

In the present invention, the guide RNA may consist of two RNAs, i.e., CRISPR RNA (crRNA) and transactivating crRNA (tracrRNA), or be a single-chain RNA (sgRNA) produced by fusion of an essential portion of crRNA and tracrRNA.

The guide RNA may be a dual RNA comprising a crRNA and a tracrRNA.

If the guide RNA comprises the essential portion of crRNA and tracrRNA and a portion complementary to a target, any guide RNA may be used in the present invention.

The crRNA may hybridize with a target DNA.

The RGEN may consist of Cas protein and dual RNA (invariable tracrRNA and target-specific crRNA), or Cas protein and sgRNA (fusion of an essential portion of invariable tracrRNA and target-specific crRNA), and may be readily reprogrammed by replacing crRNA.

The guide RNA further comprises one or more additional nucleotides at the 5′ end of the single-chain guide RNA or the crRNA of the dual RNA.

Preferably, the guide RNA further comprises two additional guanine nucleotides at the 5′ end of the single-chain guide RNA or the crRNA of the dual RNA.

The guide RNA may be transferred into a cell or an organism in the form of RNA or DNA that encodes the guide RNA. The guide RNA may be in the form of an isolated RNA or RNA incorporated into a viral vector, or may be encoded in a vector. Preferably, the vector may be a viral vector, plasmid vector, or Agrobacterium vector, but it is not limited thereto.

A DNA that encodes the guide RNA may be a vector comprising a sequence coding for the guide RNA. For example, the guide RNA may be transferred into a cell or organism by transfecting the cell or organism with the isolated guide RNA or plasmid DNA comprising a sequence coding for the guide RNA and a promoter.

Alternatively, the guide RNA may be transferred into a cell or organism using virus-mediated gene delivery.

When the guide RNA is transfected in the form of an isolated RNA into a cell or organism, the guide RNA may be prepared by in vitro transcription using any in vitro transcription system known in the art. The guide RNA is preferably transferred to a cell in the form of isolated RNA rather than in the form of a plasmid comprising an encoding sequence for a guide RNA. As used herein, the term “isolated RNA” may be used interchangeably with “naked RNA”. This is cost- and time-saving because it does not require a step of cloning. However, the use of plasmid DNA or virus-mediated gene delivery for transfection of the guide RNA is not excluded.

The present RGEN composition comprising Cas protein or Cas protein-encoding nucleic acid and a guide RNA can specifically cleave a target DNA due to a specificity of the guide RNA for a target and an endonuclease or nickase activity of Cas protein.

As used herein, the term “cleavage” refers to the breakage of the covalent backbone of a nucleotide molecule.

In the present invention, a guide RNA may be prepared to be specific for any target which is to be cleaved. Therefore, the present RGEN composition can cleave any target DNA by manipulating or genotyping the target-specific portion of the guide RNA.

The guide RNA and the Cas protein may function as a pair. As used herein, the term “paired Cas nickase” may refer to the guide RNA and the Cas protein functioning as a pair. The pair comprises two guide RNAs. The guide RNA and Cas protein may function as a pair, and induce two nicks on different DNA strands. The two nicks may be separated by at least 100 bps, but are not limited thereto.

In the Example, the present inventors confirmed that paired Cas nickases allow targeted mutagenesis and large deletions of up to 1-kbp chromosomal segments in human cells. Importantly, paired nickases did not induce indels at off-target sites at which their corresponding nucleases induce mutations. Furthermore, unlike nucleases, paired nickases did not promote unwanted translocations associated with off-target DNA cleavages. In principle, paired nickases double the specificity of Cas9-mediated mutagenesis and will broaden the utility of RNA-guided enzymes in applications that require precise genome editing such as gene and cell therapy.

In the present invention, the composition may be used in the genotyping of a genome in the eukaryotic cells or organisms in vitro.

In one specific embodiment, the guide RNA may comprise the nucleotide sequence of SEQ ID NO: 1, wherein the portion of nucleotide position 3~22 is a target-specific portion and thus, the sequence of this portion may be changed depending on a target.
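The layout described above (two additional 5′ guanines followed by a 20-nucleotide target-specific portion at positions 3 to 22) can be illustrated with a minimal Python sketch. The function names are hypothetical, the input is assumed to be the 20-nt protospacer strand written 5′ to 3′ immediately upstream of the PAM, and the invariable remainder of the guide RNA is deliberately not reproduced here.

```python
def guide_5prime_segment(protospacer_dna: str) -> str:
    """Build the first 22 nt of a guide RNA: 'GG' plus the 20-nt target-specific portion.

    Assumption (not from the specification): the input is the protospacer strand,
    5'->3', so the RNA carries the same sequence with U in place of T.
    """
    target = protospacer_dna.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("expected exactly 20 unambiguous DNA bases")
    return "GG" + target.replace("T", "U")


def target_specific_portion(guide_rna: str) -> str:
    """Return nucleotide positions 3 to 22 (1-based, inclusive) of a guide RNA."""
    return guide_rna[2:22]


if __name__ == "__main__":
    placeholder = "ACGT" * 5  # synthetic placeholder, not a real target site
    segment = guide_5prime_segment(placeholder)
    print(segment)
    print(target_specific_portion(segment))
```

Retargeting then amounts to calling `guide_5prime_segment` with a different 20-nt sequence while everything downstream of position 22 stays unchanged.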

As used herein, a eukaryotic cell or organism may be yeast, fungus, protozoa, plant, higher plant, and insect, or amphibian cells, or mammalian cells such as CHO, HeLa, HEK293, and COS-1, for example, cultured cells (in vitro), graft cells and primary cell culture (in vitro and ex vivo), and in vivo cells, and also mammalian cells including human, which are commonly used in the art, without limitation.

In one specific embodiment, it was found that Cas9 protein/single-chain guide RNA could generate site-specific DNA double-strand breaks in vitro and in mammalian cells, whose spontaneous repair induced targeted genome mutations at high frequencies.

Moreover, it was found that gene-knockout mice could be induced by the injection of Cas9 protein/guide RNA complexes or Cas9 mRNA/guide RNA into one-cell stage embryos and that germ-line transmittable mutations could be generated by the Cas9/guide RNA system.

Using Cas protein rather than a nucleic acid encoding Cas protein to induce a targeted mutagenesis is advantageous because exogenous DNA is not introduced into an organism. Thus, the composition comprising Cas protein and a guide RNA may be used to develop therapeutics or value-added crops, livestock, poultry, fish, pets, etc.

In accordance with another aspect of the invention, the present invention provides a composition for inducing targeted mutagenesis in eukaryotic cells or organisms, comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein. In addition, the present invention provides a use of the composition for inducing targeted mutagenesis in eukaryotic cells or organisms, comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

A guide RNA, Cas protein-encoding nucleic acid or Cas protein are as described above.

In accordance with another aspect of the invention, the present invention provides a kit for cleaving a target DNA or inducing targeted mutagenesis in eukaryotic cells or organisms comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

A guide RNA, Cas protein-encoding nucleic acid or Cas protein are as described above.

The kit may comprise a guide RNA and Cas protein-encoding nucleic acid or Cas protein as separate components or as one composition.

The present kit may comprise some additional components necessary for transferring the guide RNA and Cas component to a cell or an organism. For example, the kit may comprise an injection buffer such as DEPC-treated injection buffer, and materials necessary for analysis of mutation of a target DNA, but is not limited thereto.

In accordance with another aspect, the present invention provides a method for preparing a eukaryotic cell or organism comprising Cas protein and a guide RNA comprising a step of co-transfecting or serial-transfecting the eukaryotic cell or organism with a Cas protein-encoding nucleic acid or Cas protein, and a guide RNA or DNA that encodes the guide RNA.

A guide RNA, Cas protein-encoding nucleic acid or Cas protein are as described above.

In the present invention, a Cas protein-encoding nucleic acid or Cas protein and a guide RNA or DNA that encodes the guide RNA may be transferred into a cell by various methods known in the art, such as microinjection, electroporation, DEAE-dextran treatment, lipofection, nanoparticle-mediated transfection, protein transduction domain mediated transduction, virus-mediated gene delivery, and PEG-mediated transfection in protoplast, and so on, but are not limited thereto. Also, a Cas protein encoding nucleic acid or Cas protein and a guide RNA may be transferred into an organism by various methods known in the art to administer a gene or a protein, such as injection. A Cas protein-encoding nucleic acid or Cas protein may be transferred into a cell in the form of a complex with a guide RNA, or separately. Cas protein fused to a protein transduction domain such as Tat can also be delivered efficiently into cells.

Preferably, the eukaryotic cell or organism is co-transfected or serial-transfected with a Cas9 protein and a guide RNA.

The serial-transfection may be performed by transfection with Cas protein-encoding nucleic acid first, followed by second transfection with naked guide RNA. Preferably, the second transfection is after 3, 6, 12, 18, or 24 hours, but it is not limited thereto.

In accordance with another aspect, the present invention provides a eukaryotic cell or organism comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

The eukaryotic cells or organisms may be prepared by transferring the composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein into the cell or organism.

The eukaryotic cell may be yeast, fungus, protozoa, higher plant, and insect, or amphibian cells, or mammalian cells such as CHO, HeLa, HEK293, and COS-1, for example, cultured cells (in vitro), graft cells and primary cell culture (in vitro and ex vivo), and in vivo cells, and also mammalian cells including human, which are commonly used in the art, without limitation. Further the organism may be yeast, fungus, protozoa, plant, higher plant, insect, amphibian, or mammal.

In accordance with another aspect of the invention, the present invention provides a method for cleaving a target DNA or inducing targeted mutagenesis in eukaryotic cells or organisms, comprising a step of treating a cell or organism comprising a target DNA with a composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

The step of treating a cell or organism with the composition may be performed by transferring the present composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein into the cell or organism.

As described above, such transfer may be performed by microinjection, transfection, electroporation, and so on.

In accordance with another aspect of the invention, the present invention provides an embryo comprising a genome edited by the present RGEN composition comprising a guide RNA specific for target DNA or DNA that encodes the guide RNA, and Cas protein-encoding nucleic acid or Cas protein.

Any embryo can be used in the present invention, and for the present invention, the embryo may be an embryo of a mouse. The embryo may be produced by injecting PMSG (Pregnant Mare Serum Gonadotropin) and hCG (human Chorionic Gonadotropin) into a female mouse of 4 to 7 weeks and the super-ovulated female mouse may be mated to males, and the fertilized embryos may be collected from oviducts.

The present RGEN composition introduced into an embryo can cleave a target DNA complementary to the guide RNA by the action of Cas protein and cause a mutation in the target DNA. Thus, the embryo into which the present RGEN composition has been introduced has an edited genome.

In one specific embodiment, it was found that the present RGEN composition could cause a mutation in a mouse embryo and the mutation could be transmitted to offspring.

A method for introducing the RGEN composition into the embryo may be any method known in the art, such as microinjection, stem cell insertion, retrovirus insertion, and so on. Preferably, a microinjection technique can be used.

In accordance with another aspect, the present invention provides a genome-modified animal obtained by transferring the embryo comprising a genome edited by the present RGEN composition into the oviducts of an animal.

In the present invention, the term “genome-modified animal” refers to an animal whose genome has been modified in the embryo stage by the present RGEN composition, and the type of the animal is not limited.

The genome-modified animal has mutations caused by a targeted mutagenesis based on the present RGEN composition. The mutations may be any one of deletion, insertion, translocation, and inversion. The site of mutation depends on the sequence of guide RNA of the RGEN composition.

The genome-modified animal having a mutation of a gene may be used to determine the function of the gene.

In accordance with another aspect of the invention, the present invention provides a method of preparing a genome-modified animal comprising a step of introducing the present RGEN composition comprising a guide RNA specific for the target DNA or DNA that encodes the guide RNA and Cas protein-encoding nucleic acid or Cas protein into an embryo of an animal; and a step of transferring the embryo into an oviduct of a pseudopregnant foster mother to produce a genome-modified animal.

The step of introducing the present RGEN composition may be accomplished by any method known in the art such as microinjection, stem cell insertion, retroviral insertion, and so on.

In accordance with another aspect of the invention, the present invention provides a plant regenerated from the genome-modified protoplasts prepared by the method for eukaryotic cells comprising the RGEN composition.

In accordance with another aspect of the invention, the present invention provides a composition for genotyping mutations or variations in an isolated biological sample, comprising a guide RNA specific for the target DNA sequence and Cas protein. In addition, the present invention provides a composition for genotyping nucleic acid sequences in pathogenic microorganisms in an isolated biological sample, comprising a guide RNA specific for the target DNA sequence and Cas protein.

A guide RNA, Cas protein-encoding nucleic acid or Cas protein are as described above.

As used herein, the term “genotyping” refers to the “Restriction fragment length polymorphism (RFLP) assay”.

RFLP may be used in 1) the detection of indels in cells or organisms induced by the engineered nucleases, 2) the genotyping of naturally-occurring mutations or variations in cells or organisms, or 3) the genotyping of the DNA of infected pathogenic microorganisms including virus or bacteria, etc.

The mutations or variations may be induced by engineered nucleases in cells.

The engineered nuclease may be a Zinc Finger Nuclease (ZFN), a Transcription Activator-Like Effector Nuclease (TALEN), or an RGEN, but it is not limited thereto.

As used herein, the term “biological sample” includes samples for analysis, such as tissues, cells, whole blood, serum, plasma, saliva, sputum, cerebrospinal fluid or urine, but is not limited thereto.

The mutations or variations may be naturally-occurring mutations or variations.

The mutations or variations are induced by the pathogenic microorganisms. Namely, the mutations or variations occur due to the infection of pathogenic microorganisms; when the pathogenic microorganisms are detected, the biological sample is identified as infected.

The pathogenic microorganisms may be virus or bacteria, but are not limited thereto.

Engineered nuclease-induced mutations are detected by various methods, which include mismatch-sensitive Surveyor or T7 endonuclease I (T7E1) assays, RFLP analysis, fluorescent PCR, DNA melting analysis, and Sanger and deep sequencing. The T7E1 and Surveyor assays are widely used but often underestimate mutation frequencies because the assays detect heteroduplexes (formed by the hybridization of mutant and wild-type sequences or two different mutant sequences); they fail to detect homoduplexes formed by the hybridization of two identical mutant sequences. Thus, these assays cannot distinguish homozygous biallelic mutant clones from wild-type cells nor heterozygous biallelic mutants from heterozygous monoallelic mutants (FIG. 22). In addition, sequence polymorphisms near the nuclease target site can produce confounding results because the enzymes can cleave heteroduplexes formed by hybridization of these different wild-type alleles. RFLP analysis is free of these limitations and therefore is a method of choice. Indeed, RFLP analysis was one of the first methods used to detect engineered nuclease-mediated mutations. Unfortunately, however, it is limited by the availability of appropriate restriction sites.
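The limitation of mismatch-sensitive assays described above can be made concrete with a small idealized model in Python. This is only a sketch under stated assumptions: a diploid clone is represented as a pair of allele labels, both alleles are assumed to amplify equally, digestion is assumed to go to completion, and a wild-type-specific RGEN is assumed to cut only amplicons that still carry the intact target site. The function names and allele labels are hypothetical.

```python
def t7e1_cleaves(alleles):
    """Mismatch-sensitive assay: heteroduplexes can form only if the
    amplicon pool contains at least two distinct sequences."""
    return len(set(alleles)) > 1


def rgen_rflp_cleaved_fraction(alleles, wild_type="WT"):
    """Wild-type-specific RGEN: fraction of amplicons cut equals the
    fraction of alleles that still match the wild-type target."""
    return sum(a == wild_type for a in alleles) / len(alleles)


if __name__ == "__main__":
    clones = {
        "wild type": ("WT", "WT"),
        "heterozygous monoallelic": ("WT", "m1"),
        "homozygous biallelic": ("m1", "m1"),
        "heterozygous biallelic": ("m1", "m2"),
    }
    for name, alleles in clones.items():
        print(name, t7e1_cleaves(alleles), rgen_rflp_cleaved_fraction(alleles))
```

In this model the mismatch-sensitive readout is identical for wild-type and homozygous biallelic clones, and for the two heterozygous classes, whereas the cleaved fraction from the wild-type-specific RGEN separates wild-type, monoallelic, and biallelic clones.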

In accordance with another aspect of the invention, the present invention provides a kit for genotyping mutations or variations in an isolated biological sample, comprising the composition for genotyping mutations or variations in an isolated biological sample. In addition, the present invention provides a kit for genotyping nucleic acid sequences in pathogenic microorganisms in an isolated biological sample, comprising a guide RNA specific for the target DNA sequence and Cas protein.

A guide RNA, Cas protein-encoding nucleic acid or Cas protein are as described above.

In accordance with another aspect of the invention, the present invention provides a method of genotyping mutations or variations in an isolated biological sample, using the composition for genotyping mutations or variations in an isolated biological sample. In addition, the present invention provides a method of genotyping nucleic acid sequences in pathogenic microorganisms in an isolated biological sample, comprising a guide RNA specific for the target DNA sequence and Cas protein.

A guide RNA, Cas protein-encoding nucleic acid or Cas protein are as described above.

MODE FOR THE INVENTION

Hereinafter, the present invention will be described in more detail with reference to Examples. However, these Examples are for illustrative purposes only, and the invention is not intended to be limited by these Examples.

Example 1: Genome Editing Assay

1-1. DNA Cleavage Activity of Cas9 Protein

Firstly, the DNA cleavage activity of Cas9 derived from Streptococcus pyogenes in the presence or absence of a chimeric guide RNA in vitro was tested.

To this end, recombinant Cas9 protein that was expressed in and purified from E. coli was used to cleave a predigested or circular plasmid DNA that contained the 23-base pair (bp) human CCR5 target sequence. A Cas9 target sequence consists of a 20-bp DNA sequence complementary to crRNA or a chimeric guide RNA and the trinucleotide (5′-NGG-3′) protospacer adjacent motif (PAM) recognized by Cas9 itself (FIG. 1A).
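As an illustration of the 23-bp target layout described above (a 20-bp sequence followed by a 5′-NGG-3′ PAM), the following Python sketch scans one strand of a DNA string for candidate sites. It is a minimal sketch, not the procedure used in the Examples: the function names are hypothetical, ambiguous bases are ignored, and the example sequences are synthetic placeholders rather than CCR5 sequence.

```python
import re


def find_cas9_target_sites(dna: str):
    """Return (start, protospacer, pam) for every 20-nt stretch followed by NGG.

    A zero-width lookahead is used so that overlapping candidates are all reported.
    Only the strand given is scanned; pass reverse_complement(dna) for the other strand.
    """
    dna = dna.upper()
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


def reverse_complement(dna: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    placeholder = "A" * 20 + "TGG"  # synthetic 23-nt example, not a real locus
    print(find_cas9_target_sites(placeholder))
    print(find_cas9_target_sites(reverse_complement("CCA" + "T" * 20)))
```

Scanning both the sequence and its reverse complement covers sites on either strand; any further filtering (for example, for uniqueness in a genome) is outside the scope of this sketch.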

Specifically, the Cas9-coding sequence (4,104 bp), derived from Streptococcus pyogenes strain M1 GAS (NC_002737.1), was reconstituted using the human codon usage table and synthesized using oligonucleotides. First, 1-kb DNA segments were assembled using overlapping ~35-mer oligonucleotides and Phusion polymerase (New England Biolabs) and cloned into T-vector (SolGent). A full-length Cas9 sequence was assembled using four 1-kbp DNA segments by overlap PCR. The Cas9-encoding DNA segment was subcloned into p3s, which was derived from pcDNA3.1 (Invitrogen). In this vector, a peptide tag (NH2-GGSGPPKKKRKVYPYDVPDYA-COOH, SEQ ID NO: 2) containing the HA epitope and a nuclear localization signal (NLS) was added to the C-terminus of Cas9. Expression and nuclear localization of the Cas9 protein in HEK 293T cells were confirmed by western blotting using anti-HA antibody (Santa Cruz).

Then, the Cas9 cassette was subcloned into pET28-b(+) and transformed into BL21 (DE3). The expression of Cas9 was induced using 0.5 mM IPTG for 4 h at 25° C. The Cas9 protein containing the His6-tag at the C terminus was purified using Ni-NTA agarose resin (Qiagen) and dialyzed against 20 mM HEPES (pH 7.5), 150 mM KCl, 1 mM DTT, and 10% glycerol (1). Purified Cas9 (50 nM) was incubated with super-coiled or pre-digested plasmid DNA (300 ng) and chimeric RNA (50 nM) in a reaction volume of 20 μl in NEB buffer 3 for 1 h at 37° C. Digested DNA was analyzed by electrophoresis using 0.8% agarose gels.

Cas9 cleaved the plasmid DNA efficiently at the expected position only in the presence of the synthetic RNA and did not cleave a control plasmid that lacked the target sequence (FIG. 1B).

1-2. DNA Cleavage by Cas9/Guide RNA Complex in Human Cells

An RFP-GFP reporter was used to investigate whether the Cas9/guide RNA complex can cleave the target sequence incorporated between the RFP and GFP sequences in mammalian cells.

In this reporter, the GFP sequence is fused to the RFP sequence out-of-frame (2). The active GFP is expressed only when the target sequence is cleaved by site-specific nucleases, which causes frameshifting small insertions or deletions (indels) around the target sequence via error-prone non-homologous end-joining (NHEJ) repair of the double-strand break (DSB) (FIG. 2).
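The reading-frame logic of the reporter can be sketched as follows. This is a simplified illustration only: the function name and the `initial_offset` parameter (how many bases GFP sits out of frame relative to RFP, 1 or 2) are assumptions introduced here, and the sketch ignores the possibility that an indel introduces a stop codon.

```python
def gfp_in_frame(initial_offset: int, indel: int) -> bool:
    """Return True if a net indel brings GFP back into the RFP reading frame.

    initial_offset: number of bases (1 or 2) by which GFP starts out of frame.
    indel: net length change at the target site (negative = deletion).
    """
    return (initial_offset + indel) % 3 == 0


# Hypothetical reporter in which GFP is shifted by one base.
for indel in (-4, -3, -1, 1, 2):
    state = "GFP on" if gfp_in_frame(1, indel) else "GFP off"
    print(f"indel {indel:+d}: {state}")
```

Only a subset of NHEJ outcomes restores the frame, so the fraction of GFP-positive cells under-reports the total cleavage frequency.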

The RFP-GFP reporter plasmids used in this study were constructed as described previously (2). Oligonucleotides corresponding to target sites (Table 1) were synthesized (Macrogen) and annealed. The annealed oligonucleotides were ligated into a reporter vector digested with EcoRI and BamHI.

HEK 293T cells were co-transfected with Cas9-encoding plasmid (0.8 μg) and the RFP-GFP reporter plasmid (0.2 μg) in a 24-well plate using Lipofectamine 2000 (Invitrogen).

Meanwhile, the in vitro transcribed chimeric RNA had been prepared as follows. RNA was in vitro transcribed through run-off reactions using the MEGAshortscript T7 kit (Ambion) according to the manufacturer's manual. Templates for RNA in vitro transcription were generated by annealing two complementary single strand DNAs or by PCR amplification (Table 1). Transcribed RNA was resolved on an 8% denaturing urea-PAGE gel. The gel slice containing RNA was cut out and transferred to probe elution buffer. RNA was recovered in nuclease-free water followed by phenol:chloroform extraction, chloroform extraction, and ethanol precipitation. Purified RNAs were quantified by spectrometry.

At 12 h post-transfection, chimeric RNA (1 μg) prepared by in vitro transcription was transfected using Lipofectamine 2000.

At 3 d post-transfection, transfected cells were subjected to flow cytometry and cells expressing both RFP and GFP were counted.

It was found that GFP-expressing cells were obtained only when the cells were transfected first with the Cas9 plasmid and then with the guide RNA 12 h later (FIG. 2), demonstrating that RGENs could recognize and cleave the target DNA sequence in cultured human cells. Thus, GFP-expressing cells were obtained by serial-transfection of the Cas9 plasmid and the guide RNA rather than co-transfection.

TABLE 1

Gene                  Sequence (5′ to 3′)                   SEQ ID NO.

Oligonucleotides used for the construction of the reporter plasmid
CCR5          F       AATTCATGACATCAATTATTATACATCGGAGGAG    3
              R       GATCCTCCTCCGATGTATAATAATTGATGTCATG    4

Primers used in the T7E1 assay
CCR5          F1      CTCCATGGTGCTATAGAGCA                  5
              F2      GAGCCAAGCTCTCCATCTAGT                 6
              R       GCCCTGTCAAGAGTTGACAC                  7
C4BPB         F1      TATTTGGCTGGTTGAAAGGG                  8
              R1      AAAGTCATGAAATAAACACACCCA              9
              F2      CTGCATTGATATGGTAGTACCATG              10
              R2      GCTGTTCATTGCAATGGAATG                 11

Primers used for the amplification of off-target sites
ADCY5         F1      GCTCCCACCTTAGTGCTCTG                  12
              R1      GGTGGCAGGAACCTGTATGT                  13
              F2      GTCATTGGCCAGAGATGTGGA                 14
              R2      GTCCCATGACAGGCGTGTAT                  15
KCNJ6         F       GCCTGGCCAAGTTTCAGTTA                  16
              R1      TGGAGCCATTGGTTTGCATC                  17
              R2      CCAGAACTAAGCCGTTTCTGAC                18
CNTNAP2       F1      ATCACCGACAACCAGTTTCC                  19
              F2      TGCAGTGCAGACTCTTTCCA                  20
              R       AAGGACACAGGGCAACTGAA                  21
N/A Chr. 5    F1      TGTGGAACGAGTGGTGACAG                  22
              R1      GCTGGATTAGGAGGCAGGATTC                23
              F2      GTGCTGAGAACGCTTCATAGAG                24
              R2      GGACCAAACCACATTCTTCTCAC               25

Primers used for the detection of chromosomal deletions
Deletion      F       CCACATCTCGTTCTCGGTTT                  26
              R       TCACAAGCCCACAGATATTT                  27

1-3. Targeted Disruption of Endogenous Genes in Mammalian Cells by RGEN

To test whether RGENs could be used for targeted disruption of endogenous genes in mammalian cells, genomic DNA isolated from transfected cells was analyzed using T7 endonuclease I (T7E1), a mismatch-sensitive endonuclease that specifically recognizes and cleaves heteroduplexes formed by the hybridization of wild-type and mutant DNA sequences (3).

To introduce DSBs in mammalian cells using RGENs, 2×10⁶ K562 cells were transfected with 20 μg of Cas9-encoding plasmid using the 4D-Nucleofector, SF Cell Line 4D-Nucleofector X Kit, Program FF-120 (Lonza) according to the manufacturer's protocol. For this experiment, K562 (ATCC, CCL-243) cells were grown in RPMI-1640 with 10% FBS and the penicillin/streptomycin mix (100 U/ml and 100 μg/ml, respectively).

After 24 h, 10-40 μg of in vitro transcribed chimeric RNA was nucleofected into 1×10⁶ K562 cells. The in vitro transcribed chimeric RNA had been prepared as described in Example 1-2.

Cells were collected two days after RNA transfection and genomic DNA was isolated. The region including the target site was PCR-amplified using the primers described in Table 1. The amplicons were subjected to the T7E1 assay as described previously (3). For sequencing analysis, PCR products corresponding to genomic modifications were purified and cloned into the T-Blunt vector using the T-Blunt PCR Cloning Kit (SolGent). Cloned products were sequenced using the M13 primer.

It was found that mutations were induced only when the cells were transfected serially with Cas9-encoding plasmid and then with guide RNA (FIG. 3). Mutation frequencies (Indels (%) in FIG. 3A) estimated from the relative DNA band intensities were RNA-dosage dependent, ranging from 1.3% to 5.1%. DNA sequencing analysis of the PCR amplicons corroborated the induction of RGEN-mediated mutations at the endogenous sites. Indels and microhomologies, characteristic of error-prone NHEJ, were observed at the target site. The mutation frequency measured by direct sequencing was 7.3% (=7 mutant clones/96 clones), on par with those obtained with zinc finger nucleases (ZFNs) or transcription-activator-like effector nucleases (TALENs).

Serial-transfection of Cas9 plasmid and guide RNA was required to induce mutations in cells. However, when plasmids that encode guide RNA were used, serial transfection was unnecessary and cells were co-transfected with Cas9 plasmid and guide RNA-encoding plasmid.
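The two frequency measures mentioned above can be sketched numerically. The clone-count calculation follows directly from the figures given (7 mutant clones out of 96). The band-intensity estimator shown is one commonly used formula for T7E1 gels and is an assumption here, since the text does not state which formula was applied; the function names and the example intensities are hypothetical.

```python
import math


def clone_mutation_frequency(mutant_clones: int, total_clones: int) -> float:
    """Percentage of sequenced clones that carry a mutation."""
    return 100.0 * mutant_clones / total_clones


def t7e1_indel_percent(uncut: float, cut_bands: list[float]) -> float:
    """Estimate indel % from gel band intensities (assumed, commonly used estimator).

    Heteroduplexes form from random re-annealing, so the cleaved fraction
    under-represents the mutant fraction; the square-root term corrects for this.
    """
    cleaved = sum(cut_bands)
    fraction_cleaved = cleaved / (uncut + cleaved)
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


print(round(clone_mutation_frequency(7, 96), 1))          # clone counts from the text
print(round(t7e1_indel_percent(81.0, [10.0, 9.0]), 1))    # hypothetical intensities
```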

In the meantime, both ZFNs and TALENs have been successfully developed to disrupt the human CCR5 gene (3-6), which encodes a G-protein-coupled chemokine receptor, an essential co-receptor of HIV infection. A CCR5-specific ZFN is now under clinical investigation in the US for the treatment of AIDS (7). These ZFNs and TALENs, however, have off-target effects, inducing both local mutations at sites whose sequences are homologous to the on-target sequence (6, 8-10) and genome rearrangements that arise from the repair of two concurrent DSBs induced at on-target and off-target sites (11-12). The most striking off-target sites associated with these CCR5-specific engineered nucleases reside in the CCR2 locus, a close homolog of CCR5, located 15-kbp upstream of CCR5. To avoid off-target mutations in the CCR2 gene and unwanted deletions, inversions, and duplications of the 15-kbp chromosomal segment between the CCR5 on-target and CCR2 off-target sites, the present inventors intentionally chose the target site of our CCR5-specific RGEN to recognize a region within the CCR5 sequence that has no apparent homology with the CCR2 sequence.

The present inventors investigated whether the CCR5-specific RGEN had off-target effects. To this end, we searched for potential off-target sites in the human genome by identifying sites that are most homologous to the intended 23-bp target sequence. As expected, no such sites were found in the CCR2 gene. Instead, four sites, each of which carries 3-base mismatches with the on-target site, were found (FIG. 4A). The T7E1 assays showed that mutations were not detected at these sites (assay sensitivity, ~0.5%), demonstrating exquisite specificities of RGENs (FIG. 4B). Furthermore, PCR was used to detect the induction of chromosomal deletions in cells separately transfected with plasmids encoding the ZFN and RGEN specific to CCR5. Whereas the ZFN induced deletions, the RGEN did not (FIG. 4C).

Next, RGENs were reprogrammed by replacing the CCR5-specific guide RNA with a newly-synthesized RNA designed to target the human C4BPB gene, which encodes the beta chain of C4b-binding protein, a transcription factor. This RGEN induced mutations at the chromosomal target site in K562 cells at high frequencies (FIG. 3B). Mutation frequencies measured by the T7E1 assay and by direct sequencing were 14% and 8.3% (=4 mutant clones/48 clones), respectively. Out of four mutant sequences, two clones contained a single-base or two-base insertion precisely at the cleavage site, a pattern that was also observed at the CCR5 target site. These results indicate that RGENs cleave chromosomal target DNA at expected positions in cells.
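Ranking candidate off-target sites by homology amounts to counting mismatches against the on-target sequence. The sketch below is illustrative only: the on-target string is taken here as the 20-nt CCR5 protospacer plus its CGG PAM as they appear in the reporter oligonucleotide of Table 1 (an assumption about the exact 23-bp site), and the candidate site is a made-up sequence, not one of the four sites of FIG. 4A.

```python
# Assumed on-target site: 20-nt protospacer + PAM from the CCR5 oligonucleotide (Table 1).
ON_TARGET = "TGACATCAATTATTATACATCGG"


def count_mismatches(candidate: str, target: str = ON_TARGET) -> int:
    """Number of positions at which two equal-length sites differ."""
    if len(candidate) != len(target):
        raise ValueError("sites must have the same length")
    return sum(a != b for a, b in zip(candidate.upper(), target.upper()))


# Hypothetical candidate differing from the on-target site at three positions.
candidate_site = "TGGCATCACTTATTAGACATCGG"
print(count_mismatches(candidate_site))
```

A genome-wide search would apply the same comparison to every 23-bp window on both strands and keep the sites with the fewest mismatches.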

Example 2: Proteinaceous RGEN-Mediated Genome Editing

RGENs can be delivered into cells in many different forms. RGENs consist of Cas9 protein, crRNA, and tracrRNA. The two RNAs can be fused to form a single-chain guide RNA (sgRNA). A plasmid that encodes Cas9 under a promoter such as CMV or CAG can be transfected into cells. crRNA, tracrRNA, or sgRNA can also be expressed in cells using plasmids that encode these RNAs. Use of plasmids, however, often results in integration of the whole or part of the plasmids in the host genome. The bacterial sequences incorporated in plasmid DNA can cause unwanted immune response in vivo. Cells transfected with plasmid for cell therapy or animals and plants derived from DNA-transfected cells must go through a costly and lengthy regulation procedure before market approval in most developed countries. Furthermore, plasmid DNA can persist in cells for several days post-transfection, aggravating off-target effects of RGENs.

Here, we used recombinant Cas9 protein complexed with in vitro transcribed guide RNA to induce targeted disruption of endogenous genes in human cells. Recombinant Cas9 protein fused with the hexa-histidine tag was expressed in and purified from E. coli using standard Ni ion affinity chromatography and gel filtration. Purified recombinant Cas9 protein was concentrated in storage buffer (20 mM HEPES pH 7.5, 150 mM KCl, 1 mM DTT, and 10% glycerol). Cas9 protein/sgRNA complex was introduced directly into K562 cells by nucleofection: 1×10⁶ K562 cells were transfected with 22.5-225 μg (1.4-14 μM) of Cas9 protein mixed with 100 μg (29 μM) of in vitro transcribed sgRNA (or crRNA 40 μg and tracrRNA 80 μg) in 100 μl solution using the 4D-Nucleofector, SF Cell Line 4D-Nucleofector X Kit, Program FF-120 (Lonza) according to the manufacturer's protocol. After nucleofection, cells were placed in growth media in 6-well plates and incubated for 48 hr. When 2×10⁵ K562 cells were transfected with a 1/5 scaled-down protocol, 4.5-45 μg of Cas9 protein mixed with 6-60 μg of in vitro transcribed sgRNA (or crRNA 8 μg and tracrRNA 16 μg) were used and nucleofected in 20 μl solution. Nucleofected cells were then placed in growth media in 48-well plates. After 48 hr, cells were collected and genomic DNA was isolated. The genomic DNA region spanning the target site was PCR-amplified and subjected to the T7E1 assay.
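The mass and molar figures quoted for the nucleofection mix can be cross-checked with a short calculation. The molecular weights below are assumptions introduced for this sketch and are not stated in the text: roughly 160 kDa for the tagged Cas9 protein and an average of about 330 g/mol per nucleotide for a 104-nt sgRNA.

```python
CAS9_MW = 160_000        # g/mol, assumed approximate mass of recombinant Cas9
RNA_NT_MW = 330          # g/mol per nucleotide, assumed average
SGRNA_LENGTH = 104       # nucleotides, as listed for the sgRNA in Table 2


def micromolar(mass_ug: float, volume_ul: float, mw_g_per_mol: float) -> float:
    """Convert a mass in micrograms dissolved in volume_ul microlitres to micromolar."""
    moles = mass_ug * 1e-6 / mw_g_per_mol
    litres = volume_ul * 1e-6
    return moles / litres * 1e6


print(round(micromolar(22.5, 100, CAS9_MW), 1))                 # low Cas9 dose
print(round(micromolar(225, 100, CAS9_MW), 1))                  # high Cas9 dose
print(round(micromolar(100, 100, SGRNA_LENGTH * RNA_NT_MW), 1)) # sgRNA
```

Under these assumed molecular weights the results agree with the 1.4-14 μM and 29 μM values given in the text.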

As shown in FIG. 10, Cas9 protein/sgRNA complex induced targeted mutation at the CCR5 locus at frequencies that ranged from 4.8 to 38% in a sgRNA or Cas9 protein dose-dependent manner, on par with the frequency obtained with Cas9 plasmid transfection (45%). Cas9 protein/crRNA/tracrRNA complex was able to induce mutations at a frequency of 9.4%. Cas9 protein alone failed to induce mutations. When 2×10⁵ cells were transfected with 1/5 scaled-down doses of Cas9 protein and sgRNA, mutation frequencies at the CCR5 locus ranged from 2.7 to 57% in a dose-dependent manner, greater than that obtained with co-transfection of Cas9 plasmid and sgRNA plasmid (32%).

We also tested Cas9 protein/sgRNA complex that targets the ABCC11 gene and found that this complex induced indels at a frequency of 35%, demonstrating general utility of this method.

TABLE 2 Sequences of guide RNA

Target  RNA type  RNA sequence (5′ to 3′)                Length  SEQ ID NO
CCR5    sgRNA     GGUGACAUCAAUUAUUAUACAUGUUUUAGAGCUAG    104 bp  28
                  AAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCA
                  ACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU
        crRNA     GGUGACAUCAAUUAUUAUACAUGUUUUAGAGCUAU    44 bp   29
                  GCUGUUUUG
        tracrRNA  GGAACCAUUCAAAACAGCAUAGCAAGUUAAAAUAA    86 bp   30
                  GGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCG
                  AGUCGGUGCUUUUUUU

Example 3: RNA-Guided Genome Editing in Mice

To examine the gene-targeting potential of RGENs in pronuclear (PN)-stage mouse embryos, the forkhead box N1 (Foxn1) gene, which is important for thymus development and keratinocyte differentiation (Nehls et al., 1996), and the protein kinase, DNA activated, catalytic polypeptide (Prkdc) gene, which encodes an enzyme critical for DNA DSB repair and recombination (Taccioli et al., 1998) were used.

To evaluate the genome-editing activity of the Foxn1-RGEN, we injected Cas9 mRNA (10-ng/μl solution) with various doses of the sgRNA (FIG. 5a) into the cytoplasm of PN-stage mouse embryos, and conducted T7 endonuclease I (T7E1) assays (Kim et al. 2009) using genomic DNAs obtained from in vitro cultivated embryos (FIG. 6a).

Alternatively, we directly injected the RGEN in the form of recombinant Cas9 protein (0.3 to 30 ng/μl) complexed with the two-fold molar excess of Foxn1-specific sgRNA (0.14 to 14 ng/μl) into the cytoplasm or pronucleus of one-cell mouse embryos, and analyzed mutations in the Foxn1 gene using in vitro cultivated embryos (FIG. 7).

Specifically, Cas9 mRNA and sgRNAs were synthesized in vitro from linear DNA templates using the mMESSAGE mMACHINE T7 Ultra kit (Ambion) and MEGAshortscript T7 kit (Ambion), respectively, according to the manufacturers' instructions, and were diluted with appropriate amounts of diethyl pyrocarbonate (DEPC, Sigma)-treated injection buffer (0.25 mM EDTA, 10 mM Tris, pH 7.4). Templates for sgRNA synthesis were generated using oligonucleotides listed in Table 3. Recombinant Cas9 protein was obtained from ToolGen, Inc.

TABLE 3

RNA Name         Direction  Sequence (5′ to 3′)                    SEQ ID NO
Foxn1 #1 sgRNA   F          GAAATTAATACGACTCACTATAGG CAGTCTGACG    31
                            TCACACTTCCGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Foxn1 #2 sgRNA   F          GAAATTAATACGACTCACTATAGG ACTTCCAGGC    32
                            TCCACCCGACGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Foxn1 #3 sgRNA   F          GAAATTAATACGACTCACTATAGG CCAGGCTCCA    33
                            CCCGACTGGAGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Foxn1 #4 sgRNA   F          GAAATTAATACGACTCACTATAGG ACTGGAGGGC    34
                            GAACCCCAAGGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Foxn1 #5 sgRNA   F          GAAATTAATACGACTCACTATAGG ACCCCAAGGG    35
                            GACCTCATGCGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Prkdc #1 sgRNA   F          GAAATTAATACGACTCACTATAGG TTAGTTTTTT    36
                            CCAGAGACTTGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Prkdc #2 sgRNA   F          GAAATTAATACGACTCACTATAGG TTGGTTTGCT    37
                            TGTGTTTATCGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Prkdc #3 sgRNA   F          GAAATTAATACGACTCACTATAGG CACAAGCAAA    38
                            CCAAAGTCTCGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
Prkdc #4 sgRNA   F          GAAATTAATACGACTCACTATAGG CCTCAATGCT    39
                            AAGCGACTTCGTTTTAGAGCTAGAAATAGCAAGT
                            TAAAATAAGGCTAGTCCG
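Every forward oligonucleotide in Table 3 follows the same layout: a T7 promoter stretch ending in GG, a 20-nt target sequence, and the 5′ part of the guide RNA scaffold. A minimal sketch of that assembly is given below; the constant and function names are hypothetical, and the split of the oligonucleotide into these three parts is inferred from the listed sequences rather than stated in the text.

```python
# Common 5' part of the Table 3 oligonucleotides (T7 promoter region ending in GG).
T7_PROMOTER_GG = "GAAATTAATACGACTCACTATAGG"
# Common 3' part of the Table 3 oligonucleotides (start of the sgRNA scaffold).
SCAFFOLD_5P = "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCG"


def sgrna_forward_oligo(target20: str) -> str:
    """Assemble a forward template oligonucleotide for a 20-nt target sequence."""
    target20 = target20.upper()
    if len(target20) != 20 or set(target20) - set("ACGT"):
        raise ValueError("target must be 20 nt of A, C, G, T")
    return T7_PROMOTER_GG + target20 + SCAFFOLD_5P


# Foxn1 #1 target, read from SEQ ID NO: 31 in Table 3.
foxn1_1 = sgrna_forward_oligo("CAGTCTGACGTCACACTTCC")
print(foxn1_1)
print(len(foxn1_1))
```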

All animal experiments were performed in accordance with the Korean Food and Drug Administration (KFDA) guidelines. Protocols were reviewed and approved by the Institutional Animal Care and Use Committees (IACUC) of the Laboratory Animal Research Center at Yonsei University (Permit Number: 2013-0099). All mice were maintained in the specific pathogen-free facility of the Yonsei Laboratory Animal Research Center. FVB/NTac (Taconic) and ICR mouse strains were used as embryo donors and foster mothers, respectively. Female FVB/NTac mice (7-8 weeks old) were super-ovulated by intra-peritoneal injections of 5 IU pregnant mare serum gonadotropin (PMSG, Sigma) and 5 IU human chorionic gonadotropin (hCG, Sigma) at 48-hour intervals. The super-ovulated female mice were mated to FVB/NTac stud males, and fertilized embryos were collected from oviducts.

Cas9 mRNA and sgRNAs in M2 medium (Sigma) were injected into the cytoplasm of fertilized eggs with well-recognized pronuclei using a Piezo-driven micromanipulator (Prime Tech).

In the case of injection of recombinant Cas9 protein, the recombinant Cas9 protein:Foxn1-sgRNA complex was diluted with DEPC-treated injection buffer (0.25 mM EDTA, 10 mM Tris, pH 7.4) and injected into male pronuclei using a TransferMan NK2 micromanipulator and a FemtoJet microinjector (Eppendorf).

The manipulated embryos were transferred into the oviducts of pseudopregnant foster mothers to produce live animals, or were cultivated in vitro for further analyses.

To screen F0 mice and in vitro cultivated mouse embryos with RGEN-induced mutations, T7E1 assays were performed as previously described using genomic DNA samples from tail biopsies and lysates of whole embryos (Cho et al., 2013).

Briefly, the genomic region encompassing the RGEN target site was PCR-amplified, melted, and re-annealed to form heteroduplex DNA, which was treated with T7 endonuclease I (New England Biolabs), and then analyzed by agarose gel electrophoresis. Potential off-target sites were identified by searching with bowtie 0.12.9 and were also similarly monitored by T7E1 assays. The primer pairs used in these assays are listed in Tables 4 and 5.

TABLE 4 Primers used in the T7E1 assay

Gene   Direction  Sequence (5′ to 3′)                 SEQ ID NO
Foxn1  F1         GTCTGTCTATCATCTCTTCCCTTCTCTCC       40
       F2         TCCCTAATCCGATGGCTAGCTCCAG           41
       R1         ACGAGCAGCTGAAGTTAGCATGC             42
       R2         CTACTCAATGCTCTTAGAGCTACCAGGCTTGC    43
Prkdc  F          GACTGTTGTGGGGAGGGCCG                44
       F2         GGGAGGGCCGAAAGTCTTATTTTG            45
       R1         CCTGAAGACTGAAGTTGGCAGAAGTGAG        46
       R2         CTTTAGGGCTTCTTCTCTACAATCACG         47

TABLE 5 Primers used for amplification of off-target sites

Gene   Notation  Direction  Sequence (5′ to 3′)                 SEQ ID NO
Foxn1  off 1     F          CTCGGTGTGTAGCCCTGAC                 48
                 R          AGACTGGCCTGGAACTCACAG               49
       off 2     F          CACTAAAGCCTGTCAGGAAGCCG             50
                 R          CTGTGGAGAGCACACAGCAGC               51
       off 3     F          GCTGCGACCTGAGACCATG                 52
                 R          CTTCAATGGCTTCCTGCTTAGGCTAC          53
       off 4     F          GGTTCAGATGAGGCCATCCTTTC             54
                 R          CCTGATCTGCAGGCTTAACCCTTG            55
Prkdc  off 1     F          CTCACCTGCACATCACATGTGG              56
                 R          GGCATCCACCCTATGGGGTC                57
       off 2     F          GCCTTGACCTAGAGCTTAAAGAGCC           58
                 R          GGTCTTGTTAGCAGGAAGGACACTG           59
       off 3     F          AAAACTCTGCTTGATGGGATATGTGGG         60
                 R          CTCTCACTGGTTATCTGTGCTCCTTC          61
       off 4     F          GGATCAATAGGTGGTGGGGGATG             62
                 R          GTGAATGACACAATGTGACAGCTTCAG         63
       off 5     F          CACAAGACAGACCTCTCAACATTCAGTC        64
                 R          GTGCATGCATATAATCCATTCTGATTGCTCTC    65
       off 6     F1         GGGAGGCAGAGGCAGGT                   66
                 F2         GGATCTCTGTGAGTTTGAGGCCA             67
                 R1         GCTCCAGAACTCACTCTTAGGCTC            68

Mutant founders identified by the T7E1 assay were further analyzed by fPCR. Appropriate regions of genomic DNA were sequenced as described previously (Sung et al., 2013). For routine PCR genotyping of F1 progenies, the following primer pairs were used for both wild-type and mutant alleles: 5′-CTACTCCCTCCGCAGTCTGA-3′ (SEQ ID NO: 69) and 5′-CCAGGCCTAGGTTCCAGGTA-3′ (SEQ ID NO: 70) for the Foxn1 gene, and 5′-CCCCAGCATTGCAGATTTCC-3′ (SEQ ID NO: 71) and 5′-AGGGCTTCTTCTCTACAATCACG-3′ (SEQ ID NO: 72) for the Prkdc gene.

In the case of injection of Cas9 mRNA, mutant fractions (the number of mutant embryos/the number of total embryos) were dose-dependent, ranging from 33% (1 ng/μl sgRNA) to 91% (100 ng/μl) (FIG. 6b). Sequence analysis confirmed mutations in the Foxn1 gene; most mutations were small deletions (FIG. 6c), reminiscent of those induced by ZFNs and TALENs (Kim et al., 2013).

In the case of injection of Cas9 protein, these injection doses and methods minimally affected the survival and development of mouse embryos in vitro: over 70% of RGEN-injected embryos hatched out normally in both experiments. Again, mutant fractions obtained with Cas9 protein injection were dose-dependent, and reached up to 88% at the highest dose via pronucleus injection and to 71% via intra-cytoplasmic injection (FIGS. 7a and 7b). Similar to the mutation patterns induced by Cas9 mRNA plus sgRNA (FIG. 6c), those induced by the Cas9 protein-sgRNA complex were mostly small deletions (FIG. 7c). These results clearly demonstrate that RGENs have high gene-targeting activity in mouse embryos.

Encouraged by the high mutant frequencies and low cytotoxicity induced by RGENs, we produced live animals by transferring the mouse embryos into the oviducts of pseudo-pregnant foster mothers.

Notably, the birth rates were very high, ranging from 58% to 73%, and were not affected by the increasing doses of Foxn1-sgRNA (Table 6).

TABLE 6 RGEN-mediated gene-targeting in FVB/NTac mice

Target  Cas9 mRNA + sgRNA  Injected  Transferred   Total         Live            Founders+
Gene    (ng/μl)            embryos   embryos (%)   newborns (%)  newborns* (%)   (%)
Foxn1   10 + 1             76        62 (82)       45 (73)       31 (50)         12 (39)
        10 + 10            104       90 (87)       52 (58)       58 (64)         33 (57)
        10 + 100           100       90 (90)       62 (69)       58 (64)         54 (93)
        Total              280       242 (86)      159 (66)      147 (61)        99 (67)
Prkdc   50 + 50            73        58 (79)       35 (60)       33 (57)         11 (33)
        50 + 100           79        59 (75)       22 (37)       21 (36)         7 (33)
        50 + 250           94        73 (78)       37 (51)       37 (51)         21 (57)
        Total              246       190 (77)      94 (49)       91 (48)         39 (43)
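The percentages in Table 6 can be reproduced from the raw counts. The denominators used below are inferred from the numbers themselves and are an assumption, since the table does not spell them out: transferred embryos over injected embryos, total and live newborns over transferred embryos, and founders over live newborns. The dictionary keys and function names are hypothetical.

```python
# (injected, transferred, total newborns, live newborns, founders), from Table 6.
table6 = {
    "Foxn1 10+1": (76, 62, 45, 31, 12),
    "Foxn1 10+100": (100, 90, 62, 58, 54),
}


def pct(numerator: int, denominator: int) -> int:
    """Whole-number percentage, as reported in the table."""
    return round(100 * numerator / denominator)


def summarize(row: tuple[int, int, int, int, int]) -> dict[str, int]:
    injected, transferred, newborns, live, founders = row
    return {
        "transferred_pct": pct(transferred, injected),
        "birth_rate_pct": pct(newborns, transferred),
        "live_pct": pct(live, transferred),
        "mutant_fraction_pct": pct(founders, live),
    }


for name, row in table6.items():
    print(name, summarize(row))
```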

Out of 147 newborns, we obtained 99 mutant founder mice. Consistent with the results observed in cultivated embryos (FIG. 6b), mutant fractions were proportional to the doses of Foxn1-sgRNA, and reached up to 93% (100 ng/μl Foxn1-sgRNA) (Tables 6 and 7, FIG. 5b).

TABLE 7 DNA sequences of Foxn1 mutant alleles identified from a subset of T7E1-positive mutant founders

Sequence                                               SEQ ID NO  del + ins  #   Founder mice
ACTTCCAGGCTCCACCCGACTGGAGGGCGAACCCCAAGGGGACCTCATGCAGG  134
ACTTCCAGGC-------------------AACCCCAAGGGGACCTCATGCAGG  297        Δ19        1   20
ACTTCCAGGC------------------GAACCCCAAGGGGACCTCATGCAGG  298        Δ18        1   115
ACTTCCAGGCTCC----------------------------------------  299        Δ60        1   19
ACTTCCAGGCTCC----------------------------------------  300        Δ44        1   108
ACTTCCAGGCTCC---------------------CAAGGGGACCTCATGCAGG  301        Δ21        1   64
ACTTCCAGGCTCC------------TTAGGAGGCGAACCCCAAGGGGACCTCA  302        Δ12 + 6    1   126
ACTTCCAGGCTCCACC----------------------------TCATGCAGG  303        Δ28        1   5
ACTTCCAGGCTCCACCC---------------------CCAAGGGACCTCATG  304        Δ21 + 4    1   61
ACTTCCAGGCTCCACCC------------------AAGGGGACCTCATGCAGG  305        Δ18        2   95, 29
ACTTCCAGGCTCCACCC-----------------CAAGGGGACCTCATGCAGG  306        Δ17        7   12, 14, 27, 66, 108, 114, 126
ACTTCCAGGCTCCACCC---------------ACCCAAGGGGACCTCATGCAG  307        Δ15 + 1    1   32
ACTTCCAGGCTCCACCC---------------CACCCAAGGGGACCTCATGCA  308        Δ15 + 2    1   124
ACTTCCAGGCTCCACCC-------------ACCCCAAGGGGACCTCATGCAGG  309        Δ13        1   32
ACTTCCAGGCTCCACCC--------GGCGAACCCCAAGGGGACCTCATGCAGG  310        Δ8         1   110
ACTTCCAGGCTCCACCCT-------------------GGGGACCTCATGCAGG  311        Δ20 + 1    1   29
ACTTCCAGGCTCCACCCG-----------AACCCCAAGGGGACCTCATGCAGG  312        Δ11        1   111
ACTTCCAGGCTCCACCCGA----------------------ACCTCATGCAGG  313        Δ22        1   79
ACTTCCAGGCTCCACCCGA------------------GGGGACCTCATGCAGG  314        Δ18        2   13, 127
ACTTCCAGGCTCCACCCCA-----------------AGGGGACCTCATGCAGG  315        Δ17        1   24
ACTTCCAGGCTCCACCCGA-----------ACCCCAAGGGGACCTCATGCAGG  316        Δ11        5   14, 53, 58, 69, 124
ACTTCCAGGCTCCACCCGA----------GACCCCAAGGGGACCTCATGCAGG  317        Δ10        1   14
ACTTCCAGGCTCCACCCGA-----GGGCGAACCCCAAGGGGACCTCATGCAGG  318        Δ5         3   53, 79, 115
ACTTCCAGGCTCCACCCGAC-----------------------CTCATGCAGG  319        Δ23        1   108
ACTTCCAGGCTCCACCCGAC-----------CCCCAAGGGGACCTCATGCAGG  320        Δ11        1   3
ACTTCCAGGCTCCACCCGAC-----------GAAGGGCCCCAAGGGGACCTCA  321        Δ11 + 6    1   66
ACTTCCAGGCTCCACCCGAC--------GAACCCCAAGGGGACCTCATGCAGG  322        Δ8         2   3, 66
ACTTCCAGGCTCCACCCGAC-----GGCGAACCCCAAGGGGACCTCATGCAGG  323        Δ5         1   27
ACTTCCAGGCTCCACCCGAC--GTGCTTGAGGGCGAACCCCAAGGGGACCTCA  324        Δ2 + 6     2   5
ACTTCCAGGCTCCACCCGACT------CACTATCTTCTGGGCTCCTCCATGTC  325        Δ6 + 25    2   21, 114
ACTTCCAGGCTCCACCCGACT----TGGCGAACCCCAAGGGGACCTCATGCAG  326        Δ4 + 1     1   53
ACTTCCAGGCTCCACCCGACT--TGCAGGGCGAACCCCAAGGGGACCTCATGC  327        Δ2 + 3     1   126
ACTTCCAGGCTCCACCCGACTTGGAGGGCGAACCCCAAGGGGACCTCATGCAG  328        +1         15  3, 5, 12, 19, 29, 55, 56, 61, 66, 68, 81, 108, 111, 124, 127
ACTTCCAGGCTCCACCCGACTTTGGAGGGCGAACCCCAAGGGGACCTCATGCA  329        +2         2   79, 120
ACTTCCAGGCTCCACCCGACTGTTGGAGGGCGAACCCCAAGGGGACCTCATGC  330        +3         1   55
ACTTCCAGGCTCCACCCGACTGGAG(+455)GGCGAACCCCAAGGGGACCTCC  331        +455       1   13

To generate Prkdc-targeted mice, we applied a 5-fold higher concentration of Cas9 mRNA (50 ng/μl) with increasing doses of Prkdc-sgRNA (50, 100, and 250 ng/μl). Again, the birth rates were very high, ranging from 51% to 60%, enough to produce a sufficient number of newborns for the analysis (Table 6). The mutant fraction was 57% (21 mutant founders among 37 newborns) at the maximum dose of Prkdc-sgRNA. These birth rates obtained with RGENs were approximately 2- to 10-fold higher than those with TALENs reported in our previous study (Sung et al., 2013). These results demonstrate that RGENs are potent gene-targeting reagents with minimal toxicity.

To test the germ-line transmission of the mutant alleles, we crossed the Foxn1 mutant founder #108, a mosaic with four different alleles (FIG. 5c, and Table 8) with wild-type mice, and monitored the genotypes of F1 offspring.

TABLE 8 Genotypes of Foxn1 mutant mice

Founder NO.  sgRNA (ng/ml)  Genotyping Summary          Detected alleles
58*          1              not determined              Δ11
19           100            bi-allelic                  Δ60/+1
20           100            bi-allelic                  Δ67/Δ19
13           100            bi-allelic                  Δ18/+455
32           10             bi-allelic (heterozygote)   Δ13/Δ15 + 1
115          10             bi-allelic (heterozygote)   Δ18/Δ5
111          10             bi-allelic (heterozygote)   Δ11/+1
110          10             bi-allelic (homozygote)     Δ8/Δ8
120          10             bi-allelic (homozygote)     +2/+2
81           100            heterozygote                +1/WT
69           100            homozygote                  Δ11/Δ11
55           1              mosaic                      Δ18/Δ1/+1/+3
56           1              mosaic                      Δ127/Δ41/Δ2/+1
127          1              mosaic                      Δ18/+1/WT
53           1              mosaic                      Δ11/Δ5/Δ4 + 1/WT
27           10             mosaic                      Δ17/Δ5/WT
29           10             mosaic                      Δ18/Δ20 + 1/+1
95           10             mosaic                      Δ18/Δ14/Δ8/Δ4
108          10             mosaic                      +1/Δ17/Δ23/Δ44
114          10             mosaic                      Δ17/Δ8/Δ6 + 25
124          10             mosaic                      Δ11/Δ15 + 2/+1
126          10             mosaic                      Δ17/Δ2 + 3/Δ12 + 6
12           100            mosaic                      Δ30/Δ28/Δ17/+1
5            100            mosaic                      Δ28/Δ11/Δ2 + 6/+1
14           100            mosaic                      Δ17/Δ11/Δ10
21           100            mosaic                      Δ127/Δ41/Δ2/Δ6 + 25
24           100            mosaic                      Δ17/+1/WT
64           100            mosaic                      Δ31/Δ21/+1/WT
68           100            mosaic                      Δ17/Δ11/+1/WT
79           100            mosaic                      Δ22/Δ5/+2/WT
61           100            mosaic                      Δ21 + 4/Δ6/+1/+9
66**         100            mosaic                      Δ17/Δ8/Δ11 + 6/+1/WT
3            100            mosaic                      Δ11/Δ8/+1

Underlined alleles were sequenced. Alleles in red, detected by sequencing, but not by fPCR.
*Only one clone sequenced.
**Not determined by fPCR.

As expected, all the progenies were heterozygous mutants possessing the wild-type allele and one of the mutant alleles (FIG. 5d). We also confirmed the germ-line transmission in independent founder mice of Foxn1 (FIG. 8) and Prkdc (FIG. 9). To the best of our knowledge, these results provide the first evidence that RGEN-induced mutant alleles are stably transmitted to F1 progenies in animals.

Example 4: RNA-Guided Genome Editing in Plants

4-1. Production of Cas9 Protein

The Cas9 coding sequence (4104 bps), derived from Streptococcus pyogenes strain M1 GAS (NC_002737.1), was cloned into the pET28-b(+) plasmid. A nuclear targeting sequence (NLS) was included at the protein N terminus to ensure the localization of the protein to the nucleus. The pET28-b(+) plasmid containing the Cas9 ORF was transformed into BL21(DE3). Cas9 was then induced using 0.2 mM IPTG for 16 hrs at 18° C. and purified using Ni-NTA agarose beads (Qiagen) following the manufacturer's instructions. Purified Cas9 protein was concentrated using Ultracel-100K (Millipore).

4-2. Production of Guide RNA

The genomic sequence of the Arabidopsis gene encoding BRI1 was screened for the presence of an NGG motif, the so-called protospacer adjacent motif (PAM), in an exon, which is required for Cas9 targeting. To disrupt the BRI1 gene in Arabidopsis, we identified two RGEN target sites in an exon that contain the NGG motif. sgRNAs were produced in vitro using template DNA. Each template DNA was generated by extension with two partially overlapping oligonucleotides (Macrogen, Table 9) and Phusion polymerase (Thermo Scientific) using the following conditions: 98° C. 30 sec {98° C. 10 sec, 54° C. 20 sec, 72° C. 2 min}×20, 72° C. 5 min.

TABLE 9 Oligonucleotides for the production of the template DNA for in vitro transcription

Oligonucleotides         Sequence (5′-3′)                      SEQ ID NO
BRI1 target 1 (Forward)  GAAATTAATACGACTCACTATAGGTTTGAAAGAT    73
                         GGAAGCGCGGGTTTTAGAGCTAGAAATAGCAAGT
                         TAAAATAAGGCTAGTCCG
BRI1 target 2 (Forward)  GAAATTAATACGACTCACTATAGGTGAAACTAAA    74
                         CTGGTCCACAGTTTTAGAGCTAGAAATAGCAAGT
                         TAAAATAAGGCTAGTCCG
Universal (Reverse)      AAAAAAGCACCGACTCGGTGCCACTTTTTCAAGT    75
                         TGATAACGGACTAGCCTTATTTTAACTTGC
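The extension of two partially overlapping oligonucleotides can be modeled in silico by finding the overlap between the 3′ end of the forward oligonucleotide and the reverse complement of the universal reverse oligonucleotide, then merging the two. The sketch below is a simplified illustration using SEQ ID NOs: 73 and 75; the function names and the `min_overlap` threshold are hypothetical choices, not parameters from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def extend_overlapping_oligos(forward: str, reverse: str, min_overlap: int = 15) -> tuple[str, int]:
    """Return (double-stranded template as its top strand, overlap length)."""
    rc = reverse_complement(reverse)
    # Try the longest possible overlap first.
    for k in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward.endswith(rc[:k]):
            return forward + rc[k:], k
    raise ValueError("oligonucleotides do not overlap sufficiently")


bri1_target1 = (
    "GAAATTAATACGACTCACTATAGGTTTGAAAGAT"
    "GGAAGCGCGGGTTTTAGAGCTAGAAATAGCAAGT"
    "TAAAATAAGGCTAGTCCG"
)
universal_reverse = (
    "AAAAAAGCACCGACTCGGTGCCACTTTTTCAAGT"
    "TGATAACGGACTAGCCTTATTTTAACTTGC"
)

template, overlap = extend_overlapping_oligos(bri1_target1, universal_reverse)
print(overlap, len(template))
print(template)
```

The merged sequence carries the T7 promoter region at its 5′ end and the full guide RNA scaffold downstream of the 20-nt target, which is the layout needed for run-off transcription.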

The extended DNA was purified and used as a template for the in vitro production of the guide RNAs using the MEGAshortscript T7 kit (Life Technologies). Guide RNAs were then purified by phenol/chloroform extraction and ethanol precipitation. To prepare Cas9/sgRNA complexes, 10 μl of purified Cas9 protein (12 μg/μl) and 4 μl each of two sgRNAs (11 μg/μl) were mixed in 20 μl NEB3 buffer (New England Biolabs) and incubated for 10 min at 37° C.
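The final concentrations in the complex follow from a simple dilution calculation. The sketch below assumes that 20 μl is the total volume of the mixture; under that assumption the results match the 6 μg/μL Cas9 and 2.2 μg/μL per sgRNA quoted for the complex in section 4-3. The function name is hypothetical.

```python
def final_concentration(stock_ug_per_ul: float, stock_volume_ul: float, total_volume_ul: float) -> float:
    """Concentration after diluting a stock into a given total volume (C1*V1 = C2*V2)."""
    return stock_ug_per_ul * stock_volume_ul / total_volume_ul


TOTAL_UL = 20.0  # assumed total volume of the Cas9/sgRNA mixture

cas9_final = final_concentration(12.0, 10.0, TOTAL_UL)
sgrna_final = final_concentration(11.0, 4.0, TOTAL_UL)
print(cas9_final, round(sgrna_final, 2))
```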

4-3. Transfection of Cas9/sgRNA Complex to Protoplast

The leaves of 4-week-old Arabidopsis seedlings grown aseptically in petri dishes were digested in enzyme solution (1% cellulase R10, 0.5% macerozyme R10, 450 mM mannitol, 20 mM MES pH 5.7 and CPW salt) for 8-16 hrs at 25° C. with 40 rpm shaking in the dark. Enzyme/protoplast solutions were filtered and centrifuged at 100×g for 3-5 min. Protoplasts were re-suspended in CPW solution after counting cells under the microscope (×100) using a hemacytometer. Finally, protoplasts were re-suspended at 1×10⁶/ml in MMG solution (4 mM HEPES pH 5.7, 400 mM mannitol and 15 mM MgCl₂). To transfect the protoplasts with Cas9/sgRNA complex, 200 μL (200,000 protoplasts) of the protoplast suspension were gently mixed with 3.3 or 10 μL of Cas9/sgRNA complex [Cas9 protein (6 μg/μL) and two sgRNAs (2.2 μg/μL each)] and 200 μL of 40% polyethylene glycol transfection buffer (40% PEG4000, 200 mM mannitol and 100 mM CaCl₂) in 2 ml tubes. After 5-20 min incubation at room temperature, transfection was stopped by adding wash buffer with W5 solution (2 mM MES pH 5.7, 154 mM NaCl, 125 mM CaCl₂ and 5 mM KCl). Protoplasts were then collected by centrifugation for 5 min at 100×g, washed with 1 ml of W5 solution, and centrifuged for another 5 min at 100×g. The density of protoplasts was adjusted to 1×10⁵/ml and they were cultured in modified KM 8p liquid medium with 400 mM glucose.

4-4. Detection of Mutations in Arabidopsis Protoplasts and Plants

At 24 hr or 72 hr post-transfection, protoplasts were collected and genomic DNA was isolated. The genomic DNA region spanning the two target sites was PCR-amplified and subjected to the T7E1 assay. As shown in FIG. 11, indels were induced by RGENs at high frequencies that ranged from 50% to 70%. Surprisingly, mutations were induced at 24 hr post-transfection. Apparently, Cas9 protein functions immediately after transfection. PCR products were purified and cloned using the T-Blunt PCR Cloning Kit (SolGent). Plasmids were purified and subjected to Sanger sequencing with the M13F primer. One mutant sequence had a 7-bp deletion at one site (FIG. 12). The other three mutant sequences had deletions of ~220-bp DNA segments between the two RGEN sites.

Example 5: Cas9 Protein Transduction Using a Cell-Penetrating Peptide or Protein Transduction Domain

5-1. Construction of His-Cas9-Encoding Plasmid

Cas9 with a cysteine at the C-terminus was prepared by PCR amplification using the previously described Cas9 plasmid (Cho et al., 2013) as the template and cloned into pET28-(a) vector (Novagen, Merck Millipore, Germany) containing a His-tag at the N-terminus.

5-2. Cell Culture

293T (human embryonic kidney cell line) and HeLa (human cervical cancer cell line) were grown in DMEM (GIBCO-BRL Rockville) supplemented with 10% FBS and 1% penicillin and streptomycin.

5-3. Expression and Purification of Cas9 Protein

To express the Cas9 protein, E. coli BL21 cells were transformed with the pET28-(a) vector encoding Cas9 and plated onto Luria-Bertani (LB) agar medium containing 50 μg/mL kanamycin (Amresco, Solon, OH). The next day, a single colony was picked and cultured in LB broth containing 50 μg/mL kanamycin at 37° C. overnight. The following day, this starter culture at 0.1 OD600 was inoculated into Luria broth containing 50 μg/mL kanamycin and incubated for 2 hrs at 37° C. until the OD600 reached 0.6-0.8. To induce Cas9 protein expression, the cells were cultured at 30° C. overnight after addition of isopropyl-β-D-thiogalactopyranoside (IPTG) (Promega, Madison, WI) to a final concentration of 0.5 mM.

The cells were collected by centrifugation at 4000 rpm for 15-20 mins, resuspended in a lysis buffer (20 mM Tris-Cl pH 8.0, 300 mM NaCl, 20 mM imidazole, 1× protease inhibitor cocktail, 1 mg/ml lysozyme), and lysed by sonication (40% duty, 10 sec pulse, 30 sec rest, for 10 mins on ice). The soluble fraction was separated as the supernatant after centrifugation at 15,000 rpm for 20 mins at 4° C. Cas9 protein was purified at 4° C. using a column containing Ni-NTA agarose resin (QIAGEN) and an AKTA prime instrument (AKTA prime, GE Healthcare, UK). During this chromatography step, soluble protein fractions were loaded onto the Ni-NTA agarose resin column (GE Healthcare, UK) at a flow rate of 1 mL/min. The column was washed with a washing buffer (20 mM Tris-Cl pH 8.0, 300 mM NaCl, 20 mM imidazole, 1× protease inhibitor cocktail) and the bound protein was eluted at a flow rate of 0.5 ml/min with an elution buffer (20 mM Tris-Cl pH 8.0, 300 mM NaCl, 250 mM imidazole, 1× protease inhibitor cocktail). The pooled eluted fraction was concentrated and dialyzed against storage buffer (50 mM Tris-HCl, pH 8.0, 200 mM KCl, 0.1 mM EDTA, 1 mM DTT, 0.5 mM PMSF, 20% glycerol). Protein concentration was quantitated by Bradford assay (Biorad, Hercules, CA) and purity was analyzed by SDS-PAGE using bovine serum albumin as the control.

5-4. Conjugation of Cas9 to 9R4L

1 mg of Cas9 protein diluted in PBS to a concentration of 1 mg/mL and 50 μg of maleimide-9R4L peptide in 25 μL DW (Peptron, Korea) were gently mixed using a rotor at room temperature for 2 hrs and at 4° C. overnight. To remove unconjugated maleimide-9R4L, the samples were dialyzed using a 50 kDa molecular weight cutoff membrane against DPBS (pH 7.4) at 4° C. for 24 hrs. Cas9-9R4L protein was collected from the dialysis membrane and the protein amount was determined using the Bradford assay.

5-5. Preparation of sgRNA-9R4L

sgRNA (1 μg) was gently added to various amounts of C9R4LC peptide (ranging from 1 to 40 weight ratio) in 100 μl of DPBS (pH 7.4). This mixture was incubated at room temperature for 30 mins and diluted 10-fold using RNase-free deionized water. The hydrodynamic diameter and zeta potential of the formed nanoparticles were measured using dynamic light scattering (Zetasizer-nano analyzer ZS; Malvern Instruments, Worcestershire, UK).

5-6. Cas9 Protein and sgRNA Treatments

Cas9-9R4L and sgRNA-C9R4LC were applied to the cells as follows: 1 μg of sgRNA and 15 μg of C9R4LC peptide were added to 250 mL of OPTIMEM medium and incubated at room temperature for 30 mins. At 24 hrs after seeding, cells were washed with OPTIMEM medium and treated with the sgRNA-C9R4LC complex for 4 hrs at 37° C. Cells were washed again with OPTIMEM medium and treated with Cas9-9R4L for 2 hrs at 37° C. After treatment, the culture medium was replaced with serum-containing complete medium and the cells were incubated at 37° C. for 24 hrs before the next treatment. The same procedure was followed for multiple treatments of Cas9 and sgRNA on three consecutive days.

5-7. Cas9-9R4L and sgRNA-9R4L can Edit Endogenous Genes in Cultured Mammalian Cells without the Use of Additional Delivery Tools

To determine whether Cas9-9R4L and sgRNA-9R4L can edit endogenous genes in cultured mammalian cells without the use of additional delivery tools, we treated 293 cells with Cas9-9R4L and sgRNA-9R4L targeting the CCR5 gene and analyzed the genomic DNA. The T7E1 assay showed that 9% of the CCR5 gene was disrupted in cells treated with both Cas9-9R4L and sgRNA-9R4L, and that CCR5 gene disruption was not observed in control cells, including those untreated, treated with either Cas9-9R or sgRNA-9R4L alone, or treated with both unmodified Cas9 and sgRNA (FIG. 13). This suggests that treatment with Cas9-9R4L protein and sgRNA conjugated with 9R4L, but not with unmodified Cas9 and sgRNA, can lead to efficient genome editing in mammalian cells.

Example 6: Control of Off-Target Mutation According to Guide RNA Structure

Recently, three groups reported that RGENs had off-target effects in human cells. To our surprise, RGENs induced mutations efficiently at off-target sites that differ by 3 to 5 nucleotides from on-target sites. We noticed, however, that there were several differences between our RGENs and those used by others. First, we used dual RNA, which is crRNA plus tracrRNA, rather than single-guide RNA (sgRNA) that is composed of essential portions of crRNA and tracrRNA. Second, we transfected K562 cells (but not HeLa cells) with synthetic crRNA rather than plasmids encoding crRNA. HeLa cells were transfected with crRNA-encoding plasmids. Other groups used sgRNA-encoding plasmids. Third, our guide RNA had two additional guanine nucleotides at the 5′ end, which are required for efficient transcription by T7 polymerase in vitro. No such additional nucleotides were included in the sgRNA used by others. Thus, the RNA sequence of our guide RNA can be shown as 5′-GGX₂₀, whereas 5′-GX₁₉, in which X₂₀ or GX₁₉ corresponds to the 20-bp target sequence, represents the sequence used by others. The first guanine nucleotide is required for transcription by RNA polymerase in cells. To test whether off-target RGEN effects can be attributed to these differences, we chose four RGENs that induced off-target mutations in human cells at high frequencies (13). First, we compared our method of using in vitro transcribed dual RNA with the method of transfecting sgRNA-encoding plasmids in K562 cells and measured mutation frequencies at the on-target and off-target sites via the T7E1 assay. Three RGENs showed comparable mutation frequencies at on-target and off-target sites regardless of the composition of guide RNA. Interestingly, one RGEN (VEGFA site 1) did not induce indels at one validated off-target site, which differs by three nucleotides from the on-target site (termed OT1-11, FIG. 14), when synthetic dual RNA was used. But the synthetic dual RNA did not discriminate the other validated off-target site (OT1-3), which differs by two nucleotides from the on-target site.
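The two 5′-end formats compared in this Example can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the function name guide_variants and the example 20-nt sequence in the usage note are hypothetical and are not part of the disclosed experiments.

```python
def guide_variants(target20: str) -> dict:
    """Return the 5'-GGX20 and, where applicable, 5'-GX19 guide forms (as RNA).

    target20 is the 20-nt target sequence written in DNA letters.
    Hypothetical helper for illustration; not taken from the specification.
    """
    target20 = target20.upper()
    if len(target20) != 20 or set(target20) - set("ACGT"):
        raise ValueError("expected a 20-nt DNA target sequence")

    def to_rna(seq: str) -> str:
        return seq.replace("T", "U")

    # 5'-GGX20: two extra guanines in front of the full 20-nt target.
    variants = {"GGX20": to_rna("GG" + target20)}
    # 5'-GX19: the 20-nt target itself, which applies only if it starts with G.
    if target20.startswith("G"):
        variants["GX19"] = to_rna(target20)
    return variants
```

For a hypothetical target such as GATTACAGATTACAGATTAC, the sketch returns a 20-nt GX₁₉ form and a 22-nt GGX₂₀ form; a target that does not begin with G yields only the GGX₂₀ form.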

Next, we tested whether the addition of two guanine nucleotides at the 5′ end of sgRNA could make RGENs more specific by comparing 5′-GGX₂₀ (or 5′-GGGX₁₉) sgRNA with 5′-GX₁₉ sgRNA. Four GX₁₉ sgRNAs complexed with Cas9 induced indels equally efficiently at on-target and off-target sites, tolerating up to four nucleotide mismatches. In sharp contrast, GGX₂₀ sgRNAs discriminated off-target sites effectively. In fact, the T7E1 assay barely detected RGEN-induced indels at six out of the seven validated off-target sites when we used the four GGX₂₀ sgRNAs (FIG. 15). We noticed, however, that two GGX₂₀ sgRNAs (VEGFA sites 1 and 3) were less active at on-target sites than were the corresponding GX₁₉ sgRNAs. These results show that the extra nucleotides at the 5′ end can affect mutation frequencies at on-target and off-target sites, perhaps by altering guide RNA stability, concentration, or secondary structure.

These results suggest that three factors (the use of synthetic guide RNA rather than guide RNA-encoding plasmids, dual RNA rather than sgRNA, and GGX₂₀ sgRNA rather than GX₁₉ sgRNA) have cumulative effects on the discrimination of off-target sites.

Example 7: Paired Cas9 Nickases

In principle, single-strand breaks (SSBs) cannot be repaired by error-prone NHEJ but still trigger high fidelity homology-directed repair (HDR) or base excision repair. But nickase-induced targeted mutagenesis via HDR is much less efficient than is nuclease-induced mutagenesis. We reasoned that paired Cas9 nickases would produce composite DSBs, which trigger DNA repair via NHEJ or HDR, leading to efficient mutagenesis (FIG. 16A). Furthermore, paired nickases would double the specificity of Cas9-based genome editing.

We first tested several Cas9 nucleases and nickases designed to target sites in the AAVS1 locus (FIG. 16B) in vitro via fluorescent capillary electrophoresis. Unlike Cas9 nucleases that cleaved both strands of DNA substrates, Cas9 nickases composed of guide RNA and a mutant form of Cas9 in which a catalytic aspartate residue is changed to an alanine (D10A Cas9) cleaved only one strand, producing site-specific nicks (FIG. 16C, D). Interestingly, however, some nickases (AS1, AS2, AS3, and S6 in FIG. 17A) induced indels at target sites in human cells, suggesting that nicks can be converted to DSBs, albeit inefficiently, in vivo. Paired Cas9 nickases producing two adjacent nicks on opposite DNA strands yielded indels at frequencies that ranged from 14% to 91%, comparable to the effects of paired nucleases (FIG. 17A). The repair of two nicks that would produce 5′ overhangs led to the formation of indels much more frequently than those producing 3′ overhangs at three genomic loci (FIG. 17A and FIG. 18). In addition, paired nickases enabled targeted genome editing via homology-directed repair more efficiently than did single nickases (FIG. 19).

We next measured mutation frequencies of paired nickases and nucleases at off-target sites using deep sequencing. Cas9 nucleases complexed with three sgRNAs induced off-target mutations at six sites that differ by one or two nucleotides from their corresponding on-target sites with frequencies that ranged from 0.5% to 10% (FIG. 17B). In contrast, paired Cas9 nickases did not produce indels above the detection limit of 0.1% at any of the six off-target sites. The S2 Off-1 site that differs by a single nucleotide at the first position in the PAM (i.e., N in NGG) from its on-target site can be considered as another on-target site. As expected, the Cas9 nuclease complexed with the S2 sgRNA was equally efficient at this site and the on-target site. In sharp contrast, D10A Cas9 complexed with the S2 and AS2 sgRNAs discriminated this site from the on-target site by a factor of 270 fold. This paired nickase also discriminated the AS2 off-target sites (Off-1 and Off-9 in FIG. 17B) from the on-target site by factors of 160 fold and 990 fold, respectively.
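The fold discrimination values reported above can be understood as the ratio of the on-target indel frequency to the off-target indel frequency. The following sketch shows one way such a ratio might be computed from deep sequencing frequencies; it assumes that an off-target frequency below the 0.1% detection limit is replaced by the limit itself, so that the result is a lower bound. The function name and the numbers in the usage note are hypothetical and are not the measured values of FIG. 17B.

```python
def fold_discrimination(on_target_pct, off_target_pct, detection_limit_pct=0.1):
    """Return (fold, is_lower_bound) for on-target vs. off-target indel frequency.

    Frequencies are percentages. If the off-target frequency falls below the
    detection limit, the limit is used as the denominator and the ratio is
    flagged as a lower bound (an assumption made for this sketch).
    """
    if on_target_pct <= 0 or off_target_pct < 0 or detection_limit_pct <= 0:
        raise ValueError("frequencies must be non-negative and limits positive")
    if off_target_pct < detection_limit_pct:
        return on_target_pct / detection_limit_pct, True
    return on_target_pct / off_target_pct, False
```

For example, a hypothetical on-target frequency of 40% with an undetectable off-target frequency would be reported as at least about 400 fold, whereas 30% versus 3% would be reported as 10 fold.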

Example 8: Chromosomal DNA Splicing Induced by Paired Cas9 Nickases

It has been reported that two concurrent DSBs produced by engineered nucleases such as ZFNs and TALENs can promote large deletions of the intervening chromosomal segments. We tested whether two SSBs induced by paired Cas9 nickases can also produce deletions in human cells. We used PCR to detect deletion events and found that seven paired nickases induced deletions of up to 1.1-kbp chromosomal segments as efficiently as paired Cas9 nucleases did (FIG. 20A, B). DNA sequences of the PCR products confirmed the deletion events (FIG. 20C). Interestingly, the sgRNA-matching sequence remained intact in two out of seven deletion-specific PCR amplicons (underlined in FIG. 20C). In contrast, Cas9 nuclease pairs did not produce sequences that contained intact target sites. This finding suggests that two distant nicks were not converted to two separate DSBs to promote deletions of the intervening chromosomal segment. In addition, it is unlikely that two nicks separated by more than 100 bp can produce a composite DSB with large overhangs under physiological conditions because the melting temperature is very high.

We propose that two distant nicks are repaired by strand displacement in a head-to-head direction, resulting in the formation of a DSB in the middle, whose repair via NHEJ causes small deletions (FIG. 20D). Because the two target sites remain intact during this process, nickases can induce SSBs again, triggering the cycle repeatedly until the target sites are deleted. This mechanism explains why two offset nicks producing 5′ overhangs but not those producing 3′ overhangs induced indels efficiently at three loci.

We then investigated whether Cas9 nucleases and nickases can induce unwanted chromosomal translocations that result from NHEJ repair of on-target and off-target DNA cleavages (FIG. 21A). We were able to detect translocations induced by Cas9 nucleases using PCR (FIG. 21B, C). No such PCR products were amplified using genomic DNA isolated from cells transfected with the plasmids encoding the AS2+S3 Cas9 nickase pair. This result is in line with the fact that both AS2 and S3 nickases, unlike their corresponding nucleases, did not produce indels at off-target sites (FIG. 17B).

These results suggest that paired Cas9 nickases allow targeted mutagenesis and large deletions of up to 1-kbp chromosomal segments in human cells. Importantly, paired nickases did not induce indels at off-target sites at which their corresponding nucleases induce mutations. Furthermore, unlike nucleases, paired nickases did not promote unwanted translocations associated with off-target DNA cleavages. In principle, paired nickases double the specificity of Cas9-mediated mutagenesis and will broaden the utility of RNA-guided enzymes in applications that require precise genome editing such as gene and cell therapy. One caveat to this approach is that two highly active sgRNAs are needed to make an efficient nickase pair, limiting targetable sites. As shown in this and other studies, not all sgRNAs are equally active. When single clones rather than populations of cells are used for further studies or applications, the choice of guide RNAs that represent unique sequences in the genome and the use of optimized guide RNAs would suffice to avoid off-target mutations associated with Cas9 nucleases. We propose that both Cas9 nucleases and paired nickases are powerful options that will facilitate precision genome editing in cells and organisms.

Example 9: Genotyping with CRISPR/Cas-Derived RNA-Guided Endonucleases

Next, we reasoned that RGENs can be used in restriction fragment length polymorphism (RFLP) analysis, replacing conventional restriction enzymes. Engineered nucleases including RGENs induce indels at target sites when the DSBs caused by the nucleases are repaired by the error-prone non-homologous end-joining (NHEJ) system. RGENs that are designed to recognize the target sequences cannot cleave mutant sequences with indels but will cleave wildtype target sequences efficiently.

9-1. RGEN Components

crRNA and tracrRNA were prepared by in vitro transcription using the MEGAshortscript T7 kit (Ambion) according to the manufacturer's instructions. Transcribed RNAs were resolved on an 8% denaturing urea-PAGE gel. The gel slice containing RNA was cut out and transferred to elution buffer. RNA was recovered in nuclease-free water followed by phenol:chloroform extraction, chloroform extraction, and ethanol precipitation. Purified RNA was quantified by spectrometry. Templates for crRNA were prepared by annealing an oligonucleotide whose sequence is shown as 5′-GAAATTAATACGACTCACTATAGGX₂₀GTTTTAGAGCTATGCTGTTTTG-3′ (SEQ ID NO: 76), in which X₂₀ is the target sequence, and its complementary oligonucleotide. The template for tracrRNA was synthesized by extension of forward and reverse oligonucleotides (5′-GAAATTAATACGACTCACTATAGGAACCATTCAAAACAGCATAGCAAGTTAAAATAAGGCTAGTCCG-3′ (SEQ ID NO: 77) and 5′-AAAAAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATG-3′ (SEQ ID NO: 78)) using Phusion polymerase (New England BioLabs).
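The crRNA template design of SEQ ID NO: 76 can be sketched in code as follows. The flanking sequences are those given above; the function names are hypothetical, and the expected transcript assumes that T7 transcription begins at the first G immediately following the TATA of the promoter, which is consistent with the two additional 5′ guanines described in Example 6 but is stated here as an assumption.

```python
# Flanking sequences taken from SEQ ID NO: 76; helper names are hypothetical.
T7_PROMOTER_PREFIX = "GAAATTAATACGACTCACTATA"
CRRNA_REPEAT = "GTTTTAGAGCTATGCTGTTTTG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def crrna_template(x20: str) -> str:
    """Top-strand template oligonucleotide: promoter prefix + GG + X20 + repeat."""
    x20 = x20.upper()
    if len(x20) != 20 or set(x20) - set("ACGT"):
        raise ValueError("X20 must be a 20-nt DNA sequence")
    return T7_PROMOTER_PREFIX + "GG" + x20 + CRRNA_REPEAT


def reverse_complement(seq: str) -> str:
    """Complementary oligonucleotide to be annealed to the template."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def expected_transcript(x20: str) -> str:
    """Assumed crRNA transcript: everything downstream of the promoter prefix."""
    template = crrna_template(x20)
    return template[len(T7_PROMOTER_PREFIX):].replace("T", "U")
```

As a usage example, passing the first 20 nucleotides of the human C4BPB target sequence (AATGACCACTACATCCTCAA) yields a 66-nt template, its complementary oligonucleotide via reverse_complement, and a 44-nt expected transcript beginning with GG.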

9-2. Recombinant Cas9 Protein Purification

The Cas9 DNA construct used in our previous Example, which encodes Cas9 fused to the His6-tag at the C terminus, was inserted in the pET-28a expression vector. The recombinant Cas9 protein was expressed in E. coli strain BL21 (DE3) cultured in LB medium at 25° C. for 4 hours after induction with 1 mM IPTG. Cells were harvested and resuspended in buffer containing 20 mM Tris pH 8.0, 500 mM NaCl, 5 mM imidazole, and 1 mM PMSF. Cells were frozen in liquid nitrogen, thawed at 4° C., and sonicated. After centrifugation, the Cas9 protein in the lysate was bound to Ni-NTA agarose resin (Qiagen), washed with buffer containing 20 mM Tris pH 8.0, 500 mM NaCl, and 20 mM imidazole, and eluted with buffer containing 20 mM Tris pH 8.0, 500 mM NaCl, and 250 mM imidazole. Purified Cas9 protein was dialyzed against 20 mM HEPES (pH 7.5), 150 mM KCl, 1 mM DTT, and 10% glycerol and analyzed by SDS-PAGE.

9-3. T7 Endonuclease I Assay

The T7E1 assay was performed as follows. In brief, PCR products amplified using genomic DNA were denatured at 95° C., reannealed at 16° C., and incubated with 5 units of T7 Endonuclease I (New England BioLabs) for 20 min at 37° C. The reaction products were resolved using 2 to 2.5% agarose gel electrophoresis.

9-4. RGEN-RFLP Assay

PCR products (100-150 ng) were incubated for 60 min at 37° C. with optimized concentrations (Table 10) of Cas9 protein, tracrRNA, and crRNA in 10 μl NEB buffer 3 (1×). After the cleavage reaction, RNase A (4 μg) was added, and the reaction mixture was incubated for 30 min at 37° C. to remove RNA. Reactions were stopped with 6× stop solution buffer containing 30% glycerol, 1.2% SDS, and 100 mM EDTA. Products were resolved with 1-2.5% agarose gel electrophoresis and visualized with EtBr staining.

TABLE 10 Concentration of RGEN components in RFLP assays

Target Name                      Cas9 (ng/μl)   crRNA (ng/μl)   tracrRNA (ng/μl)
C4BPB                            100            25              60
PIBF-NGG-RGEN                    100            25              60
HLA-B                            1.2            0.3             0.7
CCR5-ZFN                         100            25              60
CTNNB1 Wild type specific        30             10              20
CTNNB1 mutant specific           30             10              20
CCR5 WT-specific                 100            25              60
CCR5 Δ32-specific                10             2.5             6
KRAS WT specific (wt)            30             10              20
KRAS mutant specific (m8)        30             10              20
KRAS WT specific (m6)            30             10              20
KRAS mutant specific (m6,8)      30             10              20
PIK3CA WT specific (wt)          100            25              60
PIK3CA mutant specific (m4)      30             10              20
PIK3CA WT specific (m7)          100            25              60
PIK3CA mutant specific (m4,7)    30             10              20
BRAF WT-specific                 30             10              20
BRAF mutant-specific             100            25              60
NRAS WT-specific                 100            25              60
NRAS mutant-specific             30             10              20
IDH WT-specific                  30             10              20
IDH mutant-specific              30             10              20
PIBF-NAG-RGEN                    30             10              60

TABLE 11 Primers

Gene (site)     Direction   Sequence (5′ to 3′)                 SEQ ID NO
CCR5 (RGEN)     F1          CTCCATGGTGCTATAGAGCA                79
                F2          GAGCCAAGCTCTCCATCTAGT               80
                R           GCCCTGTCAAGAGTTGACAC                81
CCR5 (ZFN)      F           GCACAGGGTGGAACAAGATGGA              82
                R           GCCAGGTACCTATCGATTGTCAGG            83
CCR5 (del32)    F           GAGCCAAGCTCTCCATCTAGT               84
                R           ACTCTGACTGGGTCACCAGC                85
C4BPB           F1          TATTTGGCTGGTTGAAAGGG                86
                R1          AAAGTCATGAAATAAACACACCCA            87
                F2          CTGCATTGATATGGTAGTACCATG            88
                R2          GCTGTTCATTGCAATGGAATG               89
CTNNB1          F           ATGGAGTTGGACATGGCCATGG              90
                R           ACTCACTATCCACAGTTCAGCATTTACC        91
KRAS            F           TGGAGATAGCTGTCAGCAACTTT             92
                R           CAACAAAGCAAAGGTAAAGTTGGTAATAG       93
PIK3CA          F           GGTTTCAGGAGATGTGTTACAAGGC           94
                R           GATTGTGCAATTCCTATGCAATCGGTC         95
NRAS            F           CACTGGGTACTTAATCTGTAGCCTC           96
                R           GGTTCCAAGTCATTCCCAGTAGC             97
IDH1            F           CATCACTGCAGTTGTAGGTTATAACTATCC      98
                R           TTGAAAACCACAGATCTGGTTGAACC          99
BRAF            F           GGAGTGCCAAGAGAATATCTGG              100
                R           CTGAAACTGGTTTCAAAATATTCGTTTTAAGG    101
PIBF            F           GCTCTGTATGCCCTGTAGTAGG              102
                R           TTTGCATCTGACCTTACCTTTG              103

9-5. Plasmid Cleavage Assay

Restriction enzyme-treated linearized plasmid (100 ng) was incubated for 60 min at 37° C. with Cas9 protein (0.1 μg), tracrRNA (60 ng), and crRNA (25 ng) in 10 μl NEB 3 buffer (1×). Reactions were stopped with 6× stop solution containing 30% glycerol, 1.2% SDS, and 100 mM EDTA. Products were resolved with 1% agarose gel electrophoresis and visualized with EtBr staining.

9-6. Strategy of RFLP

New RGENs with desired DNA specificities can be readily created by replacing crRNA; no de novo purification of custom proteins is required once recombinant Cas9 protein is available. Engineered nucleases, including RGENs, induce small insertions or deletions (indels) at target sites when the DSBs caused by the nucleases are repaired by error-prone non-homologous end-joining (NHEJ). RGENs that are designed to recognize the target sequences cleave wild-type sequences efficiently but cannot cleave mutant sequences with indels (FIG. 22).

We first tested whether RGENs can differentially cleave plasmids that contain wild-type or modified C4BPB target sequences that harbor 1- to 3-base indels at the cleavage site. None of the six plasmids with these indels were cleaved by a C4BPB-specific RGEN composed of target-specific crRNA, tracrRNA, and recombinant Cas9 protein (FIG. 23). In contrast, the plasmid with the intact target sequence was cleaved efficiently by this RGEN.

9-7. Detection of Mutations Induced by the Same RGENs Using RGEN-Mediated RFLP

Next, to test the feasibility of RGEN-mediated RFLP for detection of mutations induced by the same RGENs, we utilized gene-modified K562 human cancer cell clones established using an RGEN targeting the C4BPB gene (Table 12).

TABLE 12 Target sequence of RGENs used in this study

Gene           Target sequence            SEQ ID NO
human C4BPB    AATGACCACTACATCCTCAAGGG    104
mouse Pibf1    AGATGATGTCTCATCATCAGAGG    105

The C4BPB mutant clones used in this study have various mutations ranging from a 94-bp deletion to a 67-bp insertion (FIG. 24A). Importantly, all mutations that occurred in the mutant clones resulted in the loss of the RGEN target site. Among the 6 C4BPB clones analyzed, 4 clones have both wildtype and mutant alleles (+/−) and 2 clones have only mutant alleles (−/−).

The PCR products spanning the RGEN target site amplified from wildtype K562 genomic DNA were digested completely by the RGEN composed of target-specific crRNA, tracrRNA, and recombinant Cas9 protein expressed in and purified from E. coli (FIG. 24B/Lane 1). When the C4BPB mutant clones were subjected to RFLP analysis using the RGEN, PCR amplicons of +/− clones that contained both wildtype and mutant alleles were partially digested, and those of −/− clones that did not contain the wildtype allele were not digested at all, yielding no cleavage products corresponding to the wildtype sequence (FIG. 24B). Even a single-base insertion at the target site blocked the digestion (#12 and #28 clones) of amplified mutant alleles by the C4BPB RGEN, showing the high specificity of RGEN-mediated RFLP. We subjected the PCR amplicons to the mismatch-sensitive T7E1 assay in parallel (FIG. 24B). Notably, the T7E1 assay was not able to distinguish −/− clones from +/− clones. To make matters worse, the T7E1 assay cannot distinguish homozygous mutant clones that contain the same mutant sequence from wildtype clones, because annealing of the same mutant sequence will form a homoduplex. Thus, RGEN-mediated RFLP has a critical advantage over the conventional mismatch-sensitive nuclease assay in the analysis of mutant clones induced by engineered nucleases including ZFNs, TALENs and RGENs.
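The genotype calls described above (complete digestion for wildtype, partial digestion for +/− clones, no digestion for −/− clones) can be expressed as a simple decision rule on the fraction of PCR product cleaved by the RGEN. The sketch below is illustrative only; the function name and the cut-off values of 0.1 and 0.9 are hypothetical assumptions chosen to tolerate measurement noise and are not values taken from the experiments.

```python
def call_genotype(cleaved_fraction: float, low: float = 0.1, high: float = 0.9) -> str:
    """Call a clone genotype from the fraction of amplicon cleaved by the RGEN.

    cleaved_fraction is between 0 and 1, e.g. estimated from gel band
    intensities. The low/high cut-offs are hypothetical illustration values.
    """
    if not 0.0 <= cleaved_fraction <= 1.0:
        raise ValueError("cleaved_fraction must be between 0 and 1")
    if cleaved_fraction >= high:
        return "+/+"  # digested completely: only wildtype alleles
    if cleaved_fraction <= low:
        return "-/-"  # not digested: no wildtype allele present
    return "+/-"      # partially digested: wildtype and mutant alleles
```

Under these assumed cut-offs, a clone with roughly half of its amplicon cleaved would be called +/−, whereas a clone with no detectable cleavage would be called −/−.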

9-8. Quantitative Assay for RGEN-RFLP Analysis

We also investigated whether RGEN-RFLP analysis is a quantitative method. Genomic DNA samples isolated from the C4BPB null clone and the wild-type cells were mixed at various ratios and used for PCR amplifications. The PCR products were subjected to RGEN genotyping and the T7E1 assay in parallel (FIG. 25b). As expected, DNA cleavage by the RGEN was proportional to the wild type to mutant ratio. In contrast, results of the T7E1 assay correlated poorly with mutation frequencies inferred from the ratios and were inaccurate, especially at high mutant percentages, a situation in which complementary mutant sequences can hybridize with each other to form homoduplexes.
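The contrast between the two assays can be illustrated with a simple idealized model. The sketch below assumes that the RGEN cleaves all wild-type amplicons and no mutant amplicons, and that in the T7E1 assay a single mutant sequence reanneals at random with the wild-type sequence, so that only heteroduplexes are cleavable. These are modeling assumptions for illustration, not measured data, and the function names are hypothetical.

```python
def expected_rgen_cleaved(mutant_fraction: float) -> float:
    """Idealized RGEN-RFLP signal: the cleaved fraction equals the wild-type fraction."""
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    return 1.0 - mutant_fraction


def expected_t7e1_heteroduplex(mutant_fraction: float) -> float:
    """Idealized T7E1 signal: heteroduplex fraction after random reannealing.

    Assumes one mutant sequence, so mutant/mutant duplexes are homoduplexes
    and only wild-type/mutant duplexes (2 * m * (1 - m)) are cleavable.
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    return 2.0 * mutant_fraction * (1.0 - mutant_fraction)
```

Under this model the RGEN signal changes linearly with the mutant fraction, whereas the T7E1 signal peaks at a mutant fraction of 0.5 and falls back to zero at 1.0, which is consistent with the poor correlation noted above at high mutant percentages.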

9-9. Analysis of Mutant Mouse Founders Using RGEN-Mediated RFLP Genotyping

We also applied RGEN-mediated RFLP genotyping (RGEN genotyping in short) to the analysis of mutant mouse founders that had been established by injection of TALENs into mouse one-cell embryos (FIG. 26A). We designed and used an RGEN that recognized the TALEN target site in the Pibf1 gene (Table 10). Genomic DNA was isolated from a wildtype mouse and mutant mice and subjected to RGEN genotyping after PCR amplification. RGEN genotyping successfully detected various mutations, which ranged from one to 27-bp deletions (FIG. 26B). Unlike the T7E1 assay, RGEN genotyping enabled differential detection of +/− and −/− founders.

9-10. Detection of Mutations Induced in Human Cells by a CCR5-Specific ZFN Using RGENs

In addition, we used RGENs to detect mutations induced in human cells by a CCR5-specific ZFN, representing yet another class of engineered nucleases (FIG. 27). These results show that RGENs can detect mutations induced by nucleases other than RGENs themselves. In fact, we expect that RGENs can be designed to detect mutations induced by most, if not all, engineered nucleases. The only limitation in the design of an RGEN genotyping assay is the requirement for the GG or AG (CC or CT on the complementary strand) dinucleotide in the PAM sequence recognized by the Cas9 protein, which occurs once per 4 bp on average. Indels induced anywhere within the seed region of several bases in crRNA and the PAM nucleotides are expected to disrupt RGEN-catalyzed DNA cleavage. Indeed, we identified at least one RGEN site in most (98%) of the ZFN and TALEN sites.
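The estimate of one PAM dinucleotide per 4 bp follows from the fact that 4 of the 16 possible dinucleotides (GG, AG, CC, CT) qualify when both strands are considered, assuming equal and independent base frequencies. The sketch below reproduces this count and scans a sequence for candidate NGG and NAG PAMs on either strand; the function names are hypothetical and the scan is a simplified illustration rather than a complete RGEN design tool.

```python
from itertools import product

PLUS_STRAND_PAM = ("GG", "AG")   # N-GG or N-AG read on the given strand
MINUS_STRAND_PAM = ("CC", "CT")  # the same PAMs read from the complementary strand


def pam_dinucleotide_probability() -> float:
    """Fraction of the 16 dinucleotides that can serve as a PAM on either strand."""
    dinucleotides = ["".join(p) for p in product("ACGT", repeat=2)]
    hits = [d for d in dinucleotides if d in PLUS_STRAND_PAM + MINUS_STRAND_PAM]
    return len(hits) / len(dinucleotides)


def find_pam_sites(seq: str) -> list:
    """Return (start index, strand, trinucleotide) for each candidate PAM."""
    seq = seq.upper()
    sites = []
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri[1:] in PLUS_STRAND_PAM:    # NGG / NAG on the given strand
            sites.append((i, "+", tri))
        if tri[:2] in MINUS_STRAND_PAM:   # CCN / CTN, i.e. a PAM on the other strand
            sites.append((i, "-", tri))
    return sites
```

Applied to the human C4BPB target sequence of Table 12, the scan reports, among other candidates, the GGG trinucleotide at the 3′ end of the 23-nt sequence.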

9-11. Detection of Polymorphisms or Variations Using RGEN

Next, we designed and tested a new RGEN that targets a highly polymorphic locus, HLA-B, that encodes Human Leukocyte Antigen B (a.k.a. MHC class I protein) (FIG. 28). HeLa cells were transfected with RGEN plasmids, and the genomic DNA was subjected to T7E1 and RGEN-RFLP analyses in parallel. T7E1 produced false positive bands that resulted from sequence polymorphisms near the target site (FIG. 25c). As expected, however, the same RGEN used for gene disruption cleaved PCR products from wild-type cells completely but those from RGEN-transfected cells partially, indicating the presence of RGEN-induced indels at the target site. This result shows that RGEN-RFLP analysis has a clear advantage over the T7E1 assay, especially when it is not known whether target genes have polymorphisms or variations in cells of interest.

9-12. Detection of Recurrent Mutations Found in Cancer and Naturally-Occurring Polymorphisms Through RGEN-RFLP Analysis

RGEN-RFLP analysis has applications beyond genotyping of engineered nuclease-induced mutations. We sought to use RGEN genotyping to detect recurrent mutations found in cancer and naturally-occurring polymorphisms. We chose the human colorectal cancer cell line, HCT116, which carries a gain-of-function 3-bp deletion in the oncogenic CTNNB1 gene encoding beta-catenin. PCR products amplified from HCT116 genomic DNA were cleaved partially by both wild-type-specific and mutant-specific RGENs, in line with the heterozygous genotype in HCT116 cells (FIG. 29a). In sharp contrast, PCR products amplified from DNA from HeLa cells harboring only wild-type alleles were digested completely by the wild-type-specific RGEN and were not cleaved at all by the mutation-specific RGEN.

We also noted that HEK293 cells harbor the 32-bp deletion (del32) in the CCR5 gene, which encodes an essential co-receptor for HIV infection; homozygous del32 CCR5 carriers are immune to HIV infection. We designed one RGEN specific to the del32 allele and the other to the wild-type allele. As expected, the wild-type-specific RGEN cleaved the PCR products obtained from K562, SKBR3, or HeLa cells (used as wild-type controls) completely but those from HEK293 cells partially (FIG. 30a), confirming the presence of the uncleavable del32 allele in HEK293 cells. Unexpectedly, however, the del32-specific RGEN cleaved the PCR products from wild-type cells as efficiently as those from HEK293 cells. Interestingly, this RGEN had an off-target site with a single-base mismatch immediately downstream of the on-target site (FIG. 30). These results suggest that RGENs can be used to detect naturally-occurring indels but cannot distinguish sequences with single nucleotide polymorphisms or point mutations due to their off-target effects.

To genotype oncogenic single-nucleotide variations using RGENs, we attenuated RGEN activity by employing a single-base mismatched guide RNA instead of a perfectly-matched RNA. RGENs that contained the perfectly-matched guide RNA specific to the wild-type sequence or mutant sequence cleaved both sequences (FIGS. 31a and 32a). In contrast, RGENs that contained a single-base mismatched guide RNA distinguished the two sequences, enabling genotyping of three recurrent oncogenic point mutations in the KRAS, PIK3CA, and IDH1 genes in human cancer cell lines (FIG. 29b and FIGS. 33a, b). In addition, we were able to detect point mutations in the BRAF and NRAS genes using RGENs that recognize the NAG PAM sequence (FIGS. 33c, d). We believe that we can use RGEN-RFLP to genotype almost any, if not all, mutations or polymorphisms in the human and other genomes.
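The logic of the attenuated guide RNA can be illustrated with a simple mismatch count. The sketch assumes, purely for illustration, that an RGEN cleaves a target carrying at most one mismatch to its guide and fails to cleave a target carrying two; actual mismatch tolerance depends on position and sequence. The sequences in the usage note and all function names are hypothetical.

```python
def mismatches(guide: str, target: str) -> int:
    """Count positions at which the guide and target sequences differ."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(g != t for g, t in zip(guide.upper(), target.upper()))


def substitute(seq: str, position: int, base: str) -> str:
    """Return seq with the base at a 0-based position replaced by a different base."""
    if seq[position].upper() == base.upper():
        raise ValueError("substitution must change the base")
    return seq[:position] + base + seq[position + 1:]


def is_cleaved(guide: str, target: str, tolerated_mismatches: int = 1) -> bool:
    """Illustrative cleavage rule: cleave if mismatches do not exceed the tolerance."""
    return mismatches(guide, target) <= tolerated_mismatches
```

As a usage example, take a hypothetical wild-type sequence ACGTACGTACGTACGTACGT and a mutant carrying a point mutation at position 12. A perfectly matched wild-type guide has zero mismatches to the wild-type sequence and one to the mutant, so both are cleaved under the assumed rule. A guide carrying one deliberate mismatch at position 5 has one mismatch to the wild-type sequence and two to the mutant, so only the wild-type sequence is cleaved.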

The above data suggest that RGENs provide a platform for simple and robust RFLP analysis of various sequence variations. With high flexibility in reprogramming the target sequence, RGENs can be used to detect various genetic variations (single nucleotide variations, small insertions/deletions, structural variations) such as disease-related recurring mutations, genotypes related to a patient's drug response, and mutations induced by engineered nucleases in cells. Here, we used RGEN genotyping to detect mutations induced by engineered nucleases in cells and animals. In principle, one could also use RGENs that will specifically detect and cleave naturally-occurring variations and mutations.

Based on the above description, it should be understood by those skilled in the art that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention without departing from the technical idea or essential features of the invention as defined in the following claims. In this regard, the above-described examples are for illustrative purposes only, and the invention is not intended to be limited by these examples. The scope of the present invention should be understood to include all of the modifications or modified forms derived from the meaning and scope of the following claims or their equivalent concepts.

REFERENCES

1. M. Jinek et al., Science 337, 816 (Aug. 17, 2012).
2. H. Kim, E. Um, S. R. Cho, C. Jung, J. S. Kim, Nat Methods 8, 941 (November, 2011).
3. H. J. Kim, H. J. Lee, H. Kim, S. W. Cho, J. S. Kim, Genome Res 19, 1279 (July, 2009).
4. E. E. Perez et al., Nat Biotechnol 26, 808 (July, 2008).
5. J. C. Miller et al., Nat Biotechnol 29, 143 (February, 2011).
6. C. Mussolino et al., Nucleic Acids Res 39, 9283 (November, 2011).
7. J. Cohen, Science 332, 784 (May 13, 2011).
8. V. Pattanayak, C. L. Ramirez, J. K. Joung, D. R. Liu, Nat Methods 8, 765 (September, 2011).
9. R. Gabriel et al., Nat Biotechnol 29, 816 (September, 2011).
10. E. Kim et al., Genome Res, (Apr. 20, 2012).
11. H. J. Lee, J. Kweon, E. Kim, S. Kim, J. S. Kim, Genome Res 22, 539 (March, 2012).
12. H. J. Lee, E. Kim, J. S. Kim, Genome Res 20, 81 (January, 2010).
13. Y. Fu, J. A. Foden, C. Khayter, M. L. Maeder, D. Reyon, J. K. Joung, J. D. Sander, High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells, Nat Biotech, advance online publication (2013).

1-57. (canceled)
58. A method of modifying a target endogenous nucleic acid sequence, wherein the target endogenous nucleic acid sequence is in a nucleus of a eukaryotic cell, comprising: introducing the Cas9/RNA complex into the eukaryotic cell, wherein the Cas9/RNA complex comprises a recombinant Cas9 protein and a guide RNA, wherein the Cas9/RNA complex is a combination of the recombinant Cas9 protein and a guide RNA, wherein the guide RNA includes a crRNA and a tracrRNA, wherein the guide RNA is transcribed in vitro or synthesized chemically, wherein the target endogenous nucleic acid sequence includes a portion complementary to the crRNA of the guide RNA, and wherein the combination of the recombinant Cas9 and the guide RNA produces a modification of the target endogenous nucleic acid sequence in the nucleus of the eukaryotic cell.
59. The method of claim 58, wherein the Cas9/RNA complex is introduced to the eukaryotic cell by transfection, wherein the transfection is performed by the method selected from the group consisting of microinjection, electroporation, DEAE-dextran treatment, lipofection, nanoparticle-mediated transfection, protein transduction domain mediated transduction, virus-mediated gene delivery, and polyethylene glycol (PEG)-mediated transfection of protoplasts.
60. The method of claim 58, wherein the Cas9/RNA complex is introduced to the nucleus of the eukaryotic cell by the transfection.
61. The method of claim 58, wherein the guide RNA is (i) a dual guide RNA comprising a crRNA and a tracrRNA; or (ii) a single-chain guide RNA comprising a crRNA fused to a tracrRNA.
62. The method of claim 58, wherein the target endogenous nucleic acid comprises a trinucleotide protospacer adjacent motif (PAM) recognized by the recombinant Cas9 protein, wherein the PAM consists of trinucleotide 5′-NGG-3′.
63. The method of claim 58, wherein the recombinant Cas9 protein comprises a nuclear localization signal (NLS), wherein the NLS is at N-terminus, the C-terminus, or both the N-terminus and the C-terminus of the recombinant Cas9 protein.
64. The method of claim 58, wherein the crRNA is 20 nucleotides in length.
65. The method of claim 58, wherein the modification includes any one of deletion, insertion, substitution or indel of at least one of nucleotide.
66. The method of claim 58, further comprising inducing divisions of the eukaryotic cell to be divided into a plurality of cells which include the modified nucleic acid sequence.
67. The method of claim 58, wherein the combination of the recombinant Cas9 protein and the guide RNA is assembled by mixing and incubating the recombinant Cas9 protein and the guide RNA in vitro.
68. The method of claim 58, wherein the guide RNA is transcribed in vitro or synthesized chemically.
69. The method of claim 58, wherein the recombinant Cas9 protein is purified from a bacterial system for a plasmid-free used to avoid host genome integration.